Abstract

This paper proposes a class of origin-smooth approximators of indicators underlying
the sum-of-negative-part statistic for testing multiple inequalities. The need
for simulation or bootstrap to obtain test critical values is thereby obviated. A
simple procedure is enabled using fixed critical values. The test is shown to have
correct asymptotic size in the uniform sense that supremum finite-sample rejection
probability over null-restricted data distributions tends asymptotically to nominal
significance level. This applies under weak assumptions allowing for estimator
covariance singularity. The test is unbiased for a wide class of local alternatives. A
new theorem establishes directions in which the test is locally most powerful. The
proposed procedure is compared with predominant existing tests in structure, theory
and simulation.
1 Introduction



This paper is concerned with the problem of testing the null hypothesis $H_0$ that the true value
of a finite $p$-dimensional parameter vector $\mu$ is non-negative versus the alternative that at least
one element of $\mu$ is strictly negative. A major problem for testing such hypotheses has been
dependence of null rejection probability on the unknown subset of binding inequalities (zero-valued
$\mu_j$). Under $H_0$, the asymptotic distribution of a nontrivial test statistic is typically degenerate
at interior points (all elements of $\mu$ strictly positive) of parameter space. But at boundary points
(one or more elements zero), that distribution is non-degenerate and may depend on the number
and position of the zero elements but not on strict positives. In consequence, determining the
critical value to be used for the test at some nominal significance level $\alpha$ is a nontrivial issue.
The classic least favorable configuration (LFC) approach seeks the parameter point in the null
that maximizes the rejection probability (e.g., see Perlman (1969) and Robertson, Wright and
Dykstra (1988)). This principle risks yielding tests which have comparatively low power against
sequences of alternatives converging to boundary points which are not LFC. To improve test
power, recent literature has proposed using data-driven selection of the true binding inequalities
in place of the LFC point to compute test critical values. Whatever the critical value, it is
important to demonstrate that null rejection probability does not exceed $\alpha$ uniformly over all
$H_0$-compliant data generating processes for sample size large enough. Such uniformity has been
emphasized in recent literature (e.g., see Mikusheva (2007), Romano and Shaikh (2008), Andrews
and Guggenberger (2009), Andrews and Soares (2010) and Linton et al. (2010)) to ensure validity
of asymptotic approximation to actual finite sample test size especially when the test statistic
has a limiting distribution which is discontinuous on parameter space. Regardless of whether
the binding inequalities are fixed according to the LFC or determined via a stochastic selection
mechanism, the functional forms of test statistics proposed in this literature are generally
non-smooth and hence computation of test critical values requires simulation or bootstrap.

The contributions of the present paper are as follows. We develop a multiple inequality 
test whose implementation does not require computer intensive methods. The central idea is 
to construct a sequence of origin-smooth approximators of indicators underlying the sum-of- 
negative-part statistic for testing multiple inequalities. The approximation is a form of indicator 
smoothing in the spirit of Horowitz (1992), enabling standard asymptotic distribution results 
and obviating simulation and bootstrap computation of test critical values. Moreover, the test 
allows for estimator covariance singularity. 

The test statistic of this paper has a non-degenerate asymptotic distribution of simple analytic
form at boundary points of the null hypothesis but becomes degenerate at interior points. Despite
this type of discontinuity, the test critical value can be fixed ex ante without compromising
asymptotic validity in the uniform sense that the limit of finite sample test size (defined as
supremal rejection probability over all $H_0$-compatible data generating processes) is equal to the
nominal size. We prove that this uniformity property holds for every approximator in a wide
class allowed by the paper.

The smoothing design of this paper embodies a data driven weighting scheme which automatically
concentrates the test statistic onto those parameter estimates signaling binding inequalities.
This feature is connected to methods of binding inequality selection used in Hansen (2005),
Chernozhukov et al. (2007), Andrews and Soares (2010) and Linton et al. (2010). Indeed, the smoother
can also be interpreted as an asymptotic selector and the key component of our test statistic
coincides with the sum of elements of the difference between the estimated and recentered
null-compatible mean used to obtain the simulated test critical values for Andrews and Soares
(2010)'s generalized moment selection (GMS) based tests. The difference itself, however, is not
within the class of test statistics covered by the theory of these authors but its properties emerge
from the theory developed in the present paper.

The relative computational ease of the test of this paper might be expected to carry a cost
in terms of power. However, as we show, the test is consistent against all fixed alternatives and
is unbiased for a wide class of local alternatives. In comparison with existing tests, its relative
strength varies with the particular direction of local alternative. We provide a new theorem
establishing directions in which the test is locally most powerful. Monte Carlo results support
the theory and reveal that finite sample performance of the present test is not dominated by the
GMS based tests.

We now review relevant test methods in addition to the works cited above. The QLR test
has been well developed in the inequality test literature. See, e.g., Perlman (1969), Kodde
and Palm (1986), Wolak (1987, 1988, 1989, 1991), Gourieroux and Monfort (1995, chapter 27)
and Silvapulle and Sen (2005, chapters 3-4). This test is also applied in the moment inequality
literature (see Rosen (2008), Andrews and Guggenberger (2009) and Andrews and Soares (2010)).
The asymptotic null distribution of the QLR test statistic generally has no analytical form. Since
computing this test statistic requires solution of a quadratic optimization program subject to
non-negativity constraints, simulation and bootstrapping for the test critical value is particularly
heavy.

An extreme value (EV) form of test statistic was developed by White (2000) in the con- 
text of comparing predictive abilities among forecasting models. Such a statistic is lighter on 
computation but its asymptotic null distribution remains non-standard. Hansen (2005) incorpo- 
rates estimation of actual binding inequalities to bootstrap null distribution of the extreme value 
statistic. Hansen's refinement is a special case of the GMS based critical value estimation
proposed by Andrews and Soares (2010) who also consider a broad class of test functions including
both the QLR and other simpler forms using negative-part functions.

The rest of the paper is organized as follows. Section 2 summarizes the method of Andrews
and Soares (2010) for testing with estimated critical values which embody the GMS procedure
for estimation of binding inequalities. We contrast that with the smoothing approach of this
paper and highlight connecting features. Section 3 sets out functional assumptions on the class
of smoothers and completes construction of the test statistic. Section 4 states basic distributional
assumptions on parameter estimators and presents asymptotic null distribution of the test
statistic. Section 5 establishes key results on asymptotic size of the test. Section 6 studies test 
consistency and local power. Section 7 presents results of some Monte Carlo simulation studies. 
Section 8 concludes. Appendix A derives the details of an adjustment component of the test 
statistic. Appendix B provides proofs of theoretical results of the paper. Appendix C gives 
examples of covariance matrix singularity and illustrates how they can fit into our framework. 

2 Recentering, Selection and Smoothing in Inequality Tests 

Let $\mu = (\mu_1, \mu_2, \ldots, \mu_p)'$ be a column vector of (functions of) parameters appearing in an
econometric model. We are interested in testing:

$H_0 : \mu_j \ge 0$ for all $j \in \{1, 2, \ldots, p\}$ versus $H_1 : \mu_j < 0$ for at least one $j$. (2.1)

We assume that there exists a vector $\hat\mu$ of parameter estimators based on sample size $T$ such
that $\sqrt{T}(\hat\mu - \mu)$ is asymptotically multivariate normal with mean zero and covariance $V$ consistently
estimated by $\hat V$. The vector $\mu$ and matrix $V$ may depend on common parameters but this is
generally kept implicit for notational simplicity.

2.1 Recentering and Generalized Moment Selection in Critical Value Estimation

Recent improved tests developed by Andrews and Soares (2010) of the hypothesis (2.1) are
distinguished by their use of estimated critical values embodying a selection rule to statistically
decide which inequalities are binding ($\mu_j = 0$). In brief, these tests proceed operationally as follows.
A statistic $S(\sqrt{T}\hat\mu, \hat V)$ is first computed for some fixed function $S(\cdot, \cdot)$. The asymptotic critical
value of the statistic is then obtained by simulation (or resampling) as the appropriate quantile
of the distribution of $S(Z + K(T)\tilde\mu, \hat V)$ where $Z$ is an artificially generated vector such that
$Z \sim N(0, \hat V)$ conditionally on data, $\tilde\mu$ is a recentered null-compatible mean and $K(T) = o(\sqrt{T})$
is some positive "tuning" function increasing without bound as $T \to \infty$. Basic recentering
defines $\tilde\mu_j = 0$ for $K(T)\hat\mu_j \le 1$. Setting $\tilde\mu_j = 0$ amounts to selecting $j$ as the index of a binding
constraint. For $K(T)\hat\mu_j > 1$, $\tilde\mu_j$ is defined to ensure $K(T)\tilde\mu_j \to \infty$ as $T \to \infty$, this being
simply achieved by taking $\tilde\mu_j = \hat\mu_j$. Basic selection as stated here is a special case of the Andrews
and Soares (2010) Generalized Moment Selection (GMS) procedure.^1

^1 Indeed, this selection rule corresponds to use of moment selection function $\varphi^{(1)}$ considered by Andrews and
Soares (2010, pp. 131-132) with due allowance for standardization of parameter estimates. See also Andrews and
Barwick (2012, pp. 8-9) for various examples of the GMS selection rules.

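
The basic recentering rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not the Andrews and Soares (2010) implementation: the function name `recenter`, the example estimates and the tuner $K(T) = T^{1/4}$ are hypothetical choices made only for this example.

```python
def recenter(mu_hat, K_T):
    """Basic recentering: 0 where K(T) * mu_hat_j <= 1 (selected as binding),
    mu_hat_j itself otherwise."""
    return [0.0 if K_T * m <= 1.0 else m for m in mu_hat]

T = 10_000
K_T = 10.0                       # hypothetical tuner value, T ** 0.25 at T = 10,000
mu_hat = [0.02, -0.30, 0.50]     # illustrative estimates, not data from the paper
mu_tilde = recenter(mu_hat, K_T)
binding = [j for j, m in enumerate(mu_tilde) if m == 0.0]
print(mu_tilde, binding)
```

Only the third estimate is far enough above zero, relative to $1/K(T)$, to be treated as non-binding.
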
Data-dependent selection of binding constraints reduces possible inefficiencies arising from
fixing all the elements of $\tilde\mu$ to be zero (least favorable). On the other hand, regardless of how
$\tilde\mu$ is constructed, simulation (or bootstrap) is still needed since the asymptotic distribution of
the statistic used in this literature is generally non-standard. This applies even to test statistics
which aggregate individual discrepancy values $\min(\hat\mu_j, 0)$ in a simple manner. They include the
extreme value form studied by Hansen (2005) and the sum

$\sum_{j=1}^{p} [-\sqrt{T}\min(\hat\mu_j, 0)]$ (2.2)

lying within the very wide class of right-tailed tests studied by Andrews and Soares (2010).

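
As a small worked illustration of (2.2), the sum-of-negative-part statistic can be computed as follows. The function name and the numerical inputs are assumptions chosen for the example only.

```python
import math

def sum_negative_part(mu_hat, T):
    """Statistic (2.2): sum over j of -sqrt(T) * min(mu_hat_j, 0)."""
    return sum(-math.sqrt(T) * min(m, 0.0) for m in mu_hat)

# Only the negative estimate contributes; positive estimates add nothing.
stat = sum_negative_part([0.02, -0.30, 0.50], T=100)
stat_all_positive = sum_negative_part([0.02, 0.50], T=100)
print(stat, stat_all_positive)
```

Because every strictly positive estimate contributes exactly zero, the statistic has a kink at the origin in each argument, which is the source of the non-standard limit behavior discussed next.
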
2.2 The Smoothed Indicator Approach 

Let $1\{\cdot\}$ denote the indicator taking value unity if the statement inside the bracket is true and
zero otherwise. The root cause of non-standard distribution of (2.2) is the discontinuity at the
origin of the indicator $1\{x < 0\}$ underlying the negative-part function $\min(x, 0) = 1\{x < 0\}x$.
To overcome this problem, the present paper investigates an indicator smoothing approach as
follows.

First, we approximate the function $\min(x, 0)$ by $\Psi_T(x)x$ where $\{\Psi_T(x)\}$ is a sequence of
non-negative and non-increasing functions each of which is continuously differentiable at the origin
and converges pointwise (except possibly at the origin) as $T \to \infty$ to the indicator function
$1\{x < 0\}$. We refer to $\Psi_T(x)$ as an (origin-smoothed) indicator smoother or a smoothed indicator
for $1\{x < 0\}$.

In this paper, we will focus on the class of smoothed indicators generated as $\Psi_T(x) =
\Psi(K(T)x)$ for some fixed function $\Psi$ and a "tuner" $K(T)$ of the type mentioned in Subsection
2.1. The functional form of $\Psi$ includes decumulative distribution functions for continuous
variates as well as discrete yet origin-smooth functions. We therefore replace the individual
negative-part statistic $\sqrt{T}\min(\hat\mu_j, 0)$ of (2.2) by $\sqrt{T}\Psi_T(\hat\mu_j)\hat\mu_j$. Subject to regularity conditions
set out later, $\Psi_T(\hat\mu_j) = o_p(1/\sqrt{T})$ for strictly positive $\mu_j$ and hence the term $\sqrt{T}\Psi_T(\hat\mu_j)\hat\mu_j$
vanishes asymptotically. For zero-valued $\mu_j$, $\Psi_T(\hat\mu_j)$ tends to $\Psi(0)$ in probability and $\sqrt{T}\Psi_T(\hat\mu_j)\hat\mu_j$
is asymptotically equivalent to $\Psi(0)\sqrt{T}\hat\mu_j$.



Second, we consider a left-tailed test based on the statistic that replaces (2.2) with

$\sum_{j=1}^{p} [\sqrt{T}\Psi_T(\hat\mu_j)\hat\mu_j - \Lambda_T(\hat\mu_j, \hat v_{jj})]$ (2.3)

where $\hat v_{jj}$ is the $j$th diagonal element of $\hat V$ and $\Lambda_T$ is an adjustment term approximating the
expectation of $[\Psi_T(\hat\mu_j) - \Psi(0)]\sqrt{T}\hat\mu_j$ evaluated at $\mu_j = 0$. This expectation is non-positive,
though shrinking to zero in large samples.^2 Under suitable regularity conditions $\Lambda_T$, whose
detailed construction is given in Section 3, is non-positive for all $T$ but converges to zero in
probability. Hence, under the null hypothesis the statistic (2.3) will be asymptotically either
degenerate or equivalent in distribution to a normal variate and thus critical values for a test
using (2.3) will not require simulation.

Besides indicator smoothing, it is also appropriate to view $\Psi_T$ as a form of binding inequality
selection akin to the aforementioned GMS procedure. The smoothed indicators in (2.3)
essentially embed a data driven weighting scheme which automatically concentrates the statistic
(2.3) onto those parameter estimates signaling binding inequalities. Indeed, consider the specific
smoothed indicator constructed as $\Psi_T(x) = 1\{K(T)x \le 1\}$. Such $\Psi_T(x)$ simply shifts the point
of discontinuity away from the origin whilst still acting as a pure zero-one selector. Then the
GMS based recentering described in Subsection 2.1 would amount to setting $\tilde\mu_j = (1 - \Psi_T(\hat\mu_j))\hat\mu_j$.
In this case, the statistic (2.3) is equal to

$\sum_{j=1}^{p} [\sqrt{T}(\hat\mu_j - \tilde\mu_j) - \Lambda_T(\hat\mu_j, \hat v_{jj})]$. (2.4)

Since both $\hat\mu$ and $\tilde\mu$ are available as a by-product of the mainstream tests of Subsection 2.1, one
may as well perform a test on their difference. The asymptotic distribution of (2.4) does not
itself require simulation and recentering, so there is no circularity of argument. Though (2.4)
and the GMS test procedure are closely related, it is important to stress that the present test
enforces data driven selection of binding inequalities through smoothed indicators within the
test statistic itself rather than at the stage of critical value estimation. Therefore, the class of
statistics (2.3) does not lie in the otherwise very wide class covered by the work of Andrews and
Soares (2010).

It is worth noting that the approach to achieve asymptotic normality in this paper is distinct
from alternative devices such as those of Dykstra (1991) and Menzel (2008) who demonstrate
that even the QLR statistic can be asymptotically normal when $p$, the dimension of $\mu$, is viewed
as increasing with $T$ to infinity. Recent papers by Lee and Whang (2009) and Lee, Song and
Whang (2011) obtain asymptotic normality for a class of functional inequality test statistics.
Their particular device (poissonization) requires $\mu$ to be infinitely dimensional at the outset.
By contrast, in the framework of testing finite and fixed $p$ inequalities, the present paper (and
its preliminary versions (Chen and Szroeter (2006, 2009) and Chen (2009, Chapter 3)) where a
prototype asymptotically normal test statistic appears) uses only large $T$ asymptotics and an
indicator smoothing device. The strategy adopted by this work in testing is akin to Horowitz
(1992) who sought to resolve non-standard asymptotic behavior in estimation by replacing a
discrete indicator function with a smoothed version. Therefore, the smoothing mechanism
investigated by this paper to obtain standard asymptotic distribution results could also be of
theoretical interest in its own right.

^2 Note that $\Psi_T(\hat\mu_j)\hat\mu_j \le \Psi(0)\hat\mu_j$ for any $T$ because the function $\Psi_T(x) = \Psi(K(T)x)$ is constructed to be
non-negative and non-increasing in $x$.
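
The link between the step-at-unity selector and basic recentering can be checked numerically. The sketch below assumes the selector is $1\{K(T)x \le 1\}$ and uses hypothetical helper names and input values; it only verifies that $\Psi_T(\hat\mu_j)\hat\mu_j$ coincides with $\hat\mu_j - \tilde\mu_j$ element by element.

```python
def psi_step(x):
    """Step-at-unity selector 1{x <= 1}."""
    return 1.0 if x <= 1.0 else 0.0

def recenter_one(mu_hat_j, K_T):
    """Basic recentering of a single estimate."""
    return 0.0 if K_T * mu_hat_j <= 1.0 else mu_hat_j

K_T = 10.0                                  # hypothetical tuner value
mu_hat = [0.02, -0.30, 0.50, 0.10]          # illustrative estimates
weighted = [psi_step(K_T * m) * m for m in mu_hat]
difference = [m - recenter_one(m, K_T) for m in mu_hat]
print(weighted == difference)
```
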

3 Smoothed Indicator Class and Test Procedure 

We now formally set out regularity conditions on the smoothed indicator $\Psi_T(x)$, $x \in R$. We
require that

$\Psi_T(x) = \Psi(K(T)x)$ (3.1)

where $\Psi(\cdot)$ and $K(T)$ are functions satisfying the following assumptions:

[A1] $\Psi(x)$ is a non-increasing function and $0 \le \Psi(x) \le 1$ for $x \in R$.

[A2] $\Psi(0) > 0$ and, throughout some open interval containing $x = 0$ and at all except
possibly a finite number of points outside that interval, $\Psi(x)$ has a continuous
first derivative $\psi(x)$ that is bounded absolutely by a finite positive constant.
The left-hand limits of $\psi(y)$ as $y$ approaches $x$ exist at any $x \in R$.

[A3] $K(T)$ is positive and increasing in $T$.

[A4] $K(T) \to \infty$ and $K(T)/\sqrt{T} \to 0$ as $T \to \infty$.

[A5] $\Psi(x) \to 1$ as $x \to -\infty$.

[A6] $\sqrt{T}\Psi(K(T)x) \to 0$ as $T \to \infty$ for $x > 0$.
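
To make the assumptions concrete, the following sketch codes three candidate smoothers (step-at-unity, logistic and normal decumulative functions) and evaluates the quantity in [A6] along a grid of sample sizes. The tuner $K(T) = T^{1/4}$, the function names and the evaluation point $x = 0.5$ are assumptions made for illustration; they are not taken from the paper's simulation design.

```python
import math
from statistics import NormalDist

def psi_step(x):
    """Step-at-unity: 1{x <= 1}."""
    return 1.0 if x <= 1.0 else 0.0

def psi_logistic(x):
    """Logistic decumulative function 1 / (1 + exp(x)), written via tanh for stability."""
    return 0.5 * (1.0 - math.tanh(x / 2.0))

def psi_normal(x):
    """Standard normal decumulative function 1 - Phi(x)."""
    return 1.0 - NormalDist().cdf(x)

def tuner(T):
    """Hypothetical tuner: K(T) -> infinity while K(T)/sqrt(T) -> 0."""
    return T ** 0.25

def scaled_weight(psi, T, x):
    """sqrt(T) * Psi(K(T) x), which [A6] requires to vanish for x > 0."""
    return math.sqrt(T) * psi(tuner(T) * x)

sample_sizes = [10**2, 10**4, 10**6, 10**8]
decay = {psi.__name__: [scaled_weight(psi, T, 0.5) for T in sample_sizes]
         for psi in (psi_step, psi_logistic, psi_normal)}
for name, values in decay.items():
    print(name, values)
```

Each of the three functions is non-increasing, takes values in $[0, 1]$, is positive and smooth at the origin, and tends to one as $x \to -\infty$, so [A1], [A2] and [A5] can be read off directly.
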

Assumptions [A1]-[A6] are very mild and satisfied by all the particular $\Psi$ functions including
step-at-unity, logistic and normal, discussed in Section 7.1 and used in the simulations of this
paper. Assumption [A4] regulates the rate at which the "tuning" parameter $K(T)$ can grow
and, in the context of Andrews and Soares (2010) discussed in Subsection 2.1, enables consistent
selection of binding constraints. Forms of tuning are also used by Chernozhukov et al. (2007) and
Linton et al. (2010). [A2] enables smoothing for asymptotic normality through zero-valued $\mu_j$,
whilst [A6] creates data-driven importance weighting in the sense that each $\hat\mu_j$ corresponding to
strictly positive $\mu_j$ is likely to contribute ever less to the value of the test statistic as $T$ increases.
In consequence, the statistic will be asymptotically dominated by those $\hat\mu_j$ corresponding to zero
or negative $\mu_j$, detection of which is the very purpose of the test.

To implement the test, we have to construct the term $\Lambda_T$ in (2.3) of Subsection 2.2. Though
Assumptions [A2], [A4] and (3.1) above are given so that, for $\mu_j = 0$, $\sqrt{T}\Psi_T(\hat\mu_j)\hat\mu_j$ in (2.3) is
asymptotically equivalent to $\Psi(0)\sqrt{T}\hat\mu_j$, the difference $\sqrt{T}\Psi_T(\hat\mu_j)\hat\mu_j - \Psi(0)\sqrt{T}\hat\mu_j$ remains
non-positive in large samples. Whilst asymptotically negligible, this may be size-distorting in finite
samples. To systematically offset that effect, the adjustment term $\Lambda_T$ is constructed as follows
to approximate the expectation of $[\Psi_T(\hat\mu_j) - \Psi(0)]\sqrt{T}\hat\mu_j$.
Under Assumption [A2], there are finite increasing values $a_1, \ldots, a_n$ for some $n \ge 1$ such
that $\Psi(x)$ is continuously differentiable in intervals $(-\infty, a_1), (a_1, a_2), \ldots, (a_n, \infty)$. Because $\Psi$
is bounded and non-increasing, its one-sided limits $\Psi(a_i^-) = \lim_{x \to a_i^-}\Psi(x)$ and $\Psi(a_i^+) =
\lim_{x \to a_i^+}\Psi(x)$ for $i \in \{1, 2, \ldots, n\}$ exist. Let $\bar\psi(x)$, $x \in R$, be the "extended" derivative of $\Psi$
defined as the left-hand limit of $\psi(x)$. Namely, $\bar\psi(x) = \lim_{y \to x^-}\psi(y)$. Then the algebraic form
of $\Lambda_T$ whose detailed derivation is given in Appendix A can be written as

$\Lambda_T(\hat\mu_j, \hat v_{jj}) = \hat v_{jj}\,\bar\psi(K(T)\hat\mu_j)\,K(T)/\sqrt{T} + \sqrt{\hat v_{jj}}\sum_{i=1}^{n}\left(\Psi(a_i^+) - \Psi(a_i^-)\right)\phi\!\left(\frac{a_i\sqrt{T}}{K(T)\sqrt{\hat v_{jj}}}\right)$ (3.2)

where $\phi$ is the standard normal density function.



For the simple choice $\Psi(x) = 1\{x \le 1\}$ used to form the statistic (2.4), the extended derivative
$\bar\psi = 0$ and there is a single discontinuity at $x = 1$ so the proxy simplifies to

$\Lambda_T(\hat\mu_j, \hat v_{jj}) = -\sqrt{\hat v_{jj}}\,\phi\!\left(\frac{\sqrt{T}}{K(T)\sqrt{\hat v_{jj}}}\right)$. (3.3)

On the other hand, for everywhere continuously differentiable $\Psi$, $\bar\psi(x) = \psi(x)$ for $x \in R$ and
$\Psi(a_i^+) = \Psi(a_i^-)$ for $i \in \{1, 2, \ldots, n\}$. Hence $\Lambda_T$ for such case simplifies to

$\Lambda_T(\hat\mu_j, \hat v_{jj}) = \hat v_{jj}\,\psi(K(T)\hat\mu_j)\,K(T)/\sqrt{T}$. (3.4)

Note that since $\Psi$ is non-increasing, for any $T$, $\Lambda_T(\hat\mu_j, \hat v_{jj})$ given by (3.2) is non-positive by
construction. Besides, under Assumption [A4] $\Lambda_T(\hat\mu_j, \hat v_{jj})$ tends to zero in probability as $T$
tends to infinity. Hence for those $\mu_j \ne 0$, the impact of adjusting $\sqrt{T}\Psi_T(\hat\mu_j)\hat\mu_j$ with the term
$\Lambda_T(\hat\mu_j, \hat v_{jj})$ on test behavior is asymptotically negligible though the adjustment (3.2) is applied
for each $j \in \{1, 2, \ldots, p\}$.
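
The two simplified adjustment terms can be sketched as below. The code assumes the forms $\hat v_{jj}\psi(K(T)\hat\mu_j)K(T)/\sqrt{T}$ for a differentiable smoother (illustrated with the logistic function) and $-\sqrt{\hat v_{jj}}\,\phi(\sqrt{T}/(K(T)\sqrt{\hat v_{jj}}))$ for the step-at-unity smoother; function names and numerical inputs are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def psi_logistic_deriv(x):
    """Derivative of Psi(x) = 1 / (1 + exp(x)), equal to -Psi(x) * (1 - Psi(x))."""
    p = 0.5 * (1.0 - math.tanh(x / 2.0))
    return -p * (1.0 - p)

def adj_smooth(mu_hat_j, v_jj, T, K_T):
    """Adjustment for an everywhere differentiable smoother, as in (3.4)."""
    return v_jj * psi_logistic_deriv(K_T * mu_hat_j) * K_T / math.sqrt(T)

def adj_step(v_jj, T, K_T):
    """Adjustment for the step-at-unity smoother 1{x <= 1}, as in (3.3)."""
    s = math.sqrt(v_jj)
    return -s * STD_NORMAL.pdf(math.sqrt(T) / (K_T * s))

# Illustrative values: T = 10,000 with a hypothetical tuner value K(T) = 10.
smooth_value = adj_smooth(0.0, 1.0, 10_000, 10.0)
step_value = adj_step(1.0, 10_000, 10.0)
print(smooth_value, step_value)
```

Both values are non-positive, in line with the sign property noted above, and the step adjustment is numerically negligible once $\sqrt{T}/K(T)$ is moderately large.
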

Finally, we consider a further useful generalization by replacing each $\hat\mu_j$ in (2.3) with $\hat\theta_j\hat\mu_j$ for
any positive scalar $\hat\theta_j$, which can be fixed known or estimated. Choosing $\hat\theta_j$ to be the inverse of the
estimated asymptotic standard deviation of $\hat\mu_j$ amounts to conducting the test on t-ratios. Other
choices of $\hat\theta_j$ are discussed in Appendix C which deals with estimator covariance singularity issues.
With this enhancing feature, the adjustment term $\Lambda_T(\hat\mu_j, \hat v_{jj})$ is replaced by $\Lambda_T(\hat\theta_j\hat\mu_j, \hat\theta_j^2\hat v_{jj})$. We
now present the test procedure as follows.

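Let $\hat\Psi$, $\hat\Lambda$, $e_p$ be the $p$ dimensional column vectors and $\hat\Delta$ be the diagonal matrix defined as

$\hat\Psi = (\Psi(K(T)\hat\theta_1\hat\mu_1), \Psi(K(T)\hat\theta_2\hat\mu_2), \ldots, \Psi(K(T)\hat\theta_p\hat\mu_p))'$, (3.5)

$\hat\Lambda = (\Lambda_T(\hat\theta_1\hat\mu_1, \hat\theta_1^2\hat v_{11}), \Lambda_T(\hat\theta_2\hat\mu_2, \hat\theta_2^2\hat v_{22}), \ldots, \Lambda_T(\hat\theta_p\hat\mu_p, \hat\theta_p^2\hat v_{pp}))'$, (3.6)

$e_p = (1, 1, \ldots, 1)'$, (3.7)

$\hat\Delta = \mathrm{diag}(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_p)$. (3.8)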
Let

$Q_1 = \sqrt{T}\,\hat\Psi'\hat\Delta\hat\mu - e_p'\hat\Lambda$, (3.9)

$Q_2 = \sqrt{\hat\Psi'\hat\Delta\hat V\hat\Delta\hat\Psi}$. (3.10)

We define the test statistic as

$Q = \Phi(Q_1/Q_2)$ if $Q_2 > 0$, and $Q = 1$ if $Q_2 = 0$, (3.11)

where $\Phi(x)$ is the standard normal distribution function. For asymptotic significance level $\alpha$, we
reject $H_0$ if $Q < \alpha$. The test statistic $Q$ is therefore a form of tail probability or p-value.
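
The whole procedure fits in a short routine. The sketch below is a minimal illustration and not the authors' code: it assumes the logistic smoother, the differentiable-case adjustment $\hat\theta_j^2\hat v_{jj}\psi(K(T)\hat\theta_j\hat\mu_j)K(T)/\sqrt{T}$, and takes $Q_2$ to be the estimated standard deviation $\sqrt{\hat\Psi'\hat\Delta\hat V\hat\Delta\hat\Psi}$. All function and variable names, and the numerical inputs, are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def psi(x):
    """Logistic smoother 1 / (1 + exp(x)), written via tanh for stability."""
    return 0.5 * (1.0 - math.tanh(x / 2.0))

def psi_deriv(x):
    p = psi(x)
    return -p * (1.0 - p)

def smoothed_test(mu_hat, V_hat, T, K_T, theta=None):
    """Return (Q1, Q2, Q) for estimates mu_hat and covariance estimate V_hat."""
    p = len(mu_hat)
    theta = theta or [1.0] * p
    root_T = math.sqrt(T)
    scaled = [theta[j] * mu_hat[j] for j in range(p)]        # theta_j * mu_hat_j
    w = [psi(K_T * s) for s in scaled]                       # smoothed indicators
    lam = [theta[j] ** 2 * V_hat[j][j] * psi_deriv(K_T * scaled[j]) * K_T / root_T
           for j in range(p)]                                # adjustment terms
    q1 = root_T * sum(w[j] * scaled[j] for j in range(p)) - sum(lam)
    a = [w[j] * theta[j] for j in range(p)]                  # Delta times Psi
    q2_sq = sum(a[i] * V_hat[i][j] * a[j] for i in range(p) for j in range(p))
    q2 = math.sqrt(max(q2_sq, 0.0))
    q = STD_NORMAL.cdf(q1 / q2) if q2 > 0.0 else 1.0         # Q = 1 when Q2 = 0
    return q1, q2, q

identity = [[1.0, 0.0], [0.0, 1.0]]
_, _, q_violation = smoothed_test([-0.05, 0.40], identity, 10_000, 10.0)
_, _, q_interior = smoothed_test([0.40, 0.50], identity, 10_000, 10.0)
_, q2_far, q_far = smoothed_test([5.0, 5.0], identity, 10_000, 10.0)
print(q_violation, q_interior, q2_far, q_far)
```

In this example the first input, with one clearly negative estimate, gives a value of $Q$ far below 0.05, while the two inputs with strictly positive estimates give values of $Q$ at or near one; the last input also exercises the $Q_2 = 0$ branch because the smoothed weights underflow to zero.
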

We now sketch the reasoning which validates the test. Formal theorems are given later.
Intuitively, we should reject $H_0$ if $Q_1$ is too small. For those parameter points under $H_0$ for which
the probability limit of $Q_2$ is nonzero, $Q_2$ will be strictly positive with probability approaching
one. Then the ratio $Q_1/Q_2$ will exist and be asymptotically normal. By contrast, for all points
under $H_1$, the value of $Q_1$ will go in probability to minus infinity. Therefore, in cases where $Q_2$ is
positive, we propose to reject $H_0$ if $Q_1/Q_2$ is too small compared with the normal distribution.

Note that our assumptions on the smoothed indicators do not rule out discrete but
origin-smooth $\Psi$ functions such as the step-at-unity example of Section 7.1. For such a discrete function,
$\hat\Psi$ will be a null vector with probability approaching one when all $\mu_j$, $j \in \{1, 2, \ldots, p\}$, are strictly
positive. In this case, $Q_2$ is also zero by (3.10) with probability approaching one. Therefore,
occurrence of the event $Q_2 = 0$ is possible and signals that we should not reject $H_0$. Note
that it is not an ad hoc choice to set $Q = 1$ when $Q_2 = 0$ occurs because the probability limit
of $\Phi(Q_1/Q_2)$ is also one when all $\mu_j$ parameters are strictly positive and $\Psi$ is an everywhere
positive function.^3

4 Distributional Assumptions and Asymptotic Null Distribution

We begin by stating the following high-level assumptions which enable us to derive some basic
asymptotic properties of the test. Except for [D2], these assumptions are standard.

Define $\Delta$ as the diagonal matrix $\Delta = \mathrm{diag}(\theta_1, \theta_2, \ldots, \theta_p)$ where $\theta_j$ is strictly positive and
its estimator $\hat\theta_j$ is almost surely strictly positive for $j \in \{1, 2, \ldots, p\}$. Let $d(\mu)$ be defined as
the $p$ dimensional vector whose $j$th element equals 0, $\Psi(0)$, 1 when $\mu_j > 0$, $\mu_j = 0$, $\mu_j < 0$
respectively. For notational simplicity, we keep implicit the possible dependence of the true
values of the parameters $\mu$, $V$ and $\Delta$ on the underlying data generating process.

^3 The case of $\Psi$ being everywhere positive is more complicated because $Q_2$ can then be almost surely strictly
positive. If all $\mu_j$ parameters are strictly positive, both numerator and denominator in the ratio $Q_1/Q_2$ tend to
zero in probability. See Appendix B.4 for analysis of the asymptotic properties of the test statistic $Q$ in that case.

We assume that, as $T$ tends to infinity,

[D1] $\sqrt{T}(\hat\mu - \mu) \to_d N(0, V)$ where $V$ is some finite positive semi-definite matrix.

The variance $V$ need not be invertible but must satisfy the following condition (whose verification
is illustrated in Appendix C).

[D2] $V\Delta d(\mu) \ne 0$ for non-zero $d(\mu)$.

Assumption [D2] amounts to saying that the asymptotic distribution of $\sqrt{T}\,d(\mu)'\Delta(\hat\mu - \mu)$ should
not be degenerate.

[D3] $\hat V \to_p V$ for some almost surely positive semi-definite estimator $\hat V$.

[D4] $\hat\Delta \to_p \Delta$.

Now let $J$ denote the set $\{1, 2, \ldots, p\}$ and decompose this as $J = A \cup M \cup B$, where

$A = \{j \in J : \mu_j > 0\}$, $M = \{j \in J : \mu_j = 0\}$, $B = \{j \in J : \mu_j < 0\}$.

Let $U(0, 1)$ denote a scalar random variable that is uniformly distributed in the interval [0, 1].
We now present the asymptotic null distribution of the test statistic.

Theorem 1 (Pointwise Asymptotic Null Distribution) Given [A1], [A2], [A3], [A4], [A6]
with [D1] - [D4], the following are true under $H_0 : \mu_j \ge 0$ for all $j \in J$ with limits taken along
$T \to \infty$.

(1) If $M \ne \emptyset$, then $Q \to_d U(0, 1)$.

(2) If $M = \emptyset$, then $Q \to_p 1$.

Part (1) of this theorem reflects the fact that, for any fixed data generating process whose
$\mu$ value lies on the boundary of null hypothesis space, the distribution of the test statistic $Q$ is
asymptotically non-degenerate and, given (3.11), the limiting distribution of the ratio $Q_1/Q_2$ is
standard normal. This justifies the idea of smoothing for normality. Moreover, $Q$ has the same
limiting distribution at each boundary point. Part (2) says that, at any fixed data generating
process whose $\mu$ value lies in the interior of null hypothesis space, the asymptotic distribution of
$Q$ is degenerate and $Q$ will take value above $\alpha$ with probability tending to 1.

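
A small Monte Carlo experiment illustrates the two parts of the theorem in the scalar case $p = 1$. It is a sketch under simplifying assumptions that are not part of the paper: the data are N($\mu$, 1) so the sample mean is drawn directly, the variance is treated as known and equal to one, the smoother is logistic and the tuner is $K(T) = T^{1/4}$.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def q_scalar(mu_hat, v, T, K_T):
    """Scalar (p = 1) test statistic Q with the logistic smoother."""
    w = 0.5 * (1.0 - math.tanh(K_T * mu_hat / 2.0))      # Psi(K(T) * mu_hat)
    lam = v * (-w * (1.0 - w)) * K_T / math.sqrt(T)       # adjustment term
    q2 = w * math.sqrt(v)
    if q2 <= 0.0:
        return 1.0
    return STD_NORMAL.cdf((math.sqrt(T) * w * mu_hat - lam) / q2)

def rejection_rate(mu, T, reps, alpha, rng):
    """Share of replications with Q < alpha when the true parameter is mu."""
    K_T = T ** 0.25                                        # hypothetical tuner
    hits = 0
    for _ in range(reps):
        mu_hat = mu + rng.gauss(0.0, 1.0) / math.sqrt(T)  # sample mean of N(mu, 1) data
        hits += q_scalar(mu_hat, 1.0, T, K_T) < alpha
    return hits / reps

rng = random.Random(12345)
boundary_rate = rejection_rate(0.0, 10_000, 20_000, 0.05, rng)
interior_rate = rejection_rate(1.0, 10_000, 20_000, 0.05, rng)
print(boundary_rate, interior_rate)
```

At the boundary point the rejection frequency should be close to the nominal 5% level, whereas at the interior point rejections should essentially never occur.
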
5 Asymptotic Test Size



5.1 Pointwise and Uniform Asymptotic Control of Test Size 

Theorem 1 shows that the test statistic $Q$ is not asymptotically pivotal since its limiting
distribution and hence the asymptotic null rejection probability depend on the true value of $\mu$. By
definition, the pointwise asymptotic size of the test is the supremum of the asymptotic rejection
probability viewed as a function of $\mu$ on the domain defined by $H_0$. So Theorem 1 implies that
this size equals the nominal level $\alpha$ and hence the test is asymptotically exact in the pointwise
sense. However, pointwise asymptotic exactness is a weak property. It is desirable to ensure the
convergence of the test size to the nominal level holds uniformly over the null-restricted
parameter and data distribution spaces. In this section we present results showing that the test size is
asymptotically exact in the uniform sense.

To distinguish between pointwise and uniform modes of analysis, we need some additional
notation. Note that parameters such as $\mu$ and $V$ are functionals of the underlying data
generating distribution. Suppose the data consist of i.i.d. vectors $x_t$ ($t = 1, \ldots, T$) drawn from a
joint distribution $G$. We henceforth use the notation $P_G(\cdot)$ to make explicit the dependence of
probability on $G$. Let $\Gamma$ denote the set of all possible $G$ compatible with prior knowledge or
presumed specification of the data generating process. Then Assumptions [D1] - [D4] amount to
restrictions characterizing the class $\Gamma$. Let $\Gamma_0$ be the subset of $\Gamma$ that satisfies the null hypothesis.
In the present test procedure, "$Q < \alpha$" is synonymous with "$Q$ rejects $H_0$". Hence, the rejection
probability of the test is $P_G(Q < \alpha)$ and the finite sample test size is $\sup_{G \in \Gamma_0} P_G(Q < \alpha)$.

Though Theorem 1 implies that convergence of rejection probability is not uniform over
$G \in \Gamma_0$, the test can be shown to be uniformly asymptotically level $\alpha$ (Lehmann and Romano
(2005, p. 422)) in the sense that

$\limsup_{T \to \infty} \sup_{G \in \Gamma_0} P_G(Q < \alpha) \le \alpha$. (5.1)

Inequality (5.1) and Part (1) of Theorem 1 together imply the test size is asymptotically exact
in the uniform sense that

$\limsup_{T \to \infty} \sup_{G \in \Gamma_0} P_G(Q < \alpha) = \alpha$. (5.2)

The property (5.2) is important for the asymptotic size to be a good approximation to the
finite-sample size of the test.^4 Such uniformity property has been emphasized in recent literature
(e.g., see Mikusheva (2007), Romano and Shaikh (2008), Andrews and Guggenberger (2009)
and Andrews and Soares (2010)) particularly when limit behavior of the test statistic can be
discontinuous. Accordingly, we establish the validity of (5.2) in Theorem 2.

^4 Note that the notion of asymptotic test size using $\limsup_{T \to \infty} \sup_{G \in \Gamma_0} P_G(Q < \alpha)$ is stronger than its
pointwise version $\sup_{G \in \Gamma_0} \limsup_{T \to \infty} P_G(Q < \alpha)$. See Lehmann and Romano (2005, p. 422) for an illustrating
example in which pointwise asymptotic size can be a very poor approximation to the finite sample test size.
Before presenting the formal regularity conditions ensuring (5.2), we explain here how (5.2)
is possible despite asymptotic non-pivotality of the test statistic. First note that by (3.11),

$P_G(Q < \alpha) \le P_G(Q_1 - z_\alpha Q_2 < 0)$ (5.3)

where $z_\alpha$ is the $\alpha$ quantile of the standard normal distribution. The transformed statistic ($Q_1 -
z_\alpha Q_2$) is still not asymptotically pivotal but it can be shown that, given any arbitrary sufficiently
small (relative to model constants) positive scalar $\eta$, we have with probability at least $(1 - \eta)$
for all sufficiently large $T$ that

$Q_1 - z_\alpha Q_2 \ge r_T'\sqrt{T}(\hat\mu - \mu) - (z_\alpha c_2(\eta) + c_1(\eta))\sqrt{r_T' V r_T}$

where $r_T$, $\mu$ and $V$ are non-stochastic $G$-dependent quantities such that either $r_T = 0$ or $r_T' V r_T$
is bounded away from zero over $G \in \Gamma_0$, whilst $c_1(\eta)$ and $c_2(\eta)$ are non-stochastic functions that
do not depend on $G$ and $c_1(\eta) \to 0$ and $c_2(\eta) \to 1$ as $\eta \to 0$. Therefore,

$P_G(Q_1 - z_\alpha Q_2 < 0) \le P_G\left(r_T'\sqrt{T}(\hat\mu - \mu) < (z_\alpha c_2(\eta) + c_1(\eta))\sqrt{r_T' V r_T}\right) + \eta$ (5.4)

whose right hand side will tend, uniformly over $G$ giving non-zero $r_T$, to $\Phi(z_\alpha c_2(\eta) + c_1(\eta)) + \eta$ which
is also automatically a weak upper bound on (5.4) for the case $r_T = 0$. This uniformly valid
probability bound therefore applies to (5.3) for arbitrarily small $\eta$ and hence implies that (5.1) holds.
Equality is obtained by invoking Theorem 1 which says $\alpha$ is actually attained as the limit of
$P_G(Q < \alpha)$ evaluated at any fixed $G \in \Gamma_0$ whose $\mu$ has at least a zero-valued element.

The explanation provided above is indicative but short of a formal proof. In the next
subsection we present additional "uniform" assumptions, strengthening the existing "pointwise"
assumptions [D1] - [D4] of Section 4, that are needed to make the argument rigorous. The full
proof, along with examples to illustrate some of the assumptions, will be found in
Appendix B.
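
The first step of the argument, the bound (5.3), can be checked mechanically: rejection ($Q < \alpha$) requires $Q_2 > 0$ and $Q_1/Q_2 < z_\alpha$, which forces $Q_1 - z_\alpha Q_2 < 0$, while the converse can fail when $Q_2 = 0$. The sketch below uses made-up values of $(Q_1, Q_2)$ purely for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def q_from(q1, q2):
    """Test statistic Q built from Q1 and Q2, with Q = 1 when Q2 = 0."""
    return STD_NORMAL.cdf(q1 / q2) if q2 > 0.0 else 1.0

alpha = 0.05
z_alpha = STD_NORMAL.inv_cdf(alpha)      # alpha quantile, roughly -1.645

cases = [(-2.0, 1.0), (-1.0, 1.0), (0.5, 0.3), (-0.1, 0.0), (-3.0, 2.0)]
rejects = [q_from(q1, q2) < alpha for q1, q2 in cases]
transformed_negative = [(q1 - z_alpha * q2) < 0.0 for q1, q2 in cases]
print(rejects)
print(transformed_negative)
```

The fourth case has $Q_2 = 0$ and a negative $Q_1$: the transformed statistic is negative although the test does not reject, which is why (5.3) is an inequality rather than an equality.
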

5.2 Uniform Asymptotic Exactness of Test Size 

In this section we rigorously address the issue of asymptotic exactness of test size in the uniform
sense given by (5.2). For this purpose, we strengthen Assumptions [D1] - [D4] by the following
Assumptions [U1] - [U4] where objects such as $K(T)$ have already been defined in Assumptions
[A1] - [A6]. Define the vector $Y$ and the scalar $\delta_T$ as

$Y = \sqrt{T}(\hat\mu - \mu)$, $\delta_T = \sqrt{K(T)/\sqrt{T}}$.

Note that Assumption [A4] implies that $\delta_T \to 0$ as $T \to \infty$. For any matrix $m$, let $\|m\| =
\max\{|m_{ij}|\}$ where $m_{ij}$ denotes the $(i,j)$-th element of $m$.
Assumption [U1] : For any finite scalar value $\eta > 0$,

$\lim_{T \to \infty} \inf_{G \in \Gamma_0} P_G(\delta_T\|Y\| < \eta,\ \|\hat V - V_G\| < \eta) = 1$.

Assumption [U2] : Let $\Phi(\cdot)$ denote the standard normal distribution function. Then given
any finite scalar $c$,

$\lim_{T \to \infty} \sup_{G \in \Gamma_0} \sup_{\beta : \beta' V_G \beta = 1} |P_G(\beta' Y < c) - \Phi(c)| = 0$. (5.5)

To illustrate how the high-level Assumptions [U1] and [U2] may be verified, consider the
leading example where $\hat\mu$ and $\hat V$ are the sample mean and variance of i.i.d. random vectors $x_t$
($t = 1, 2, \ldots, T$) with joint distribution $G$.^5 Then the simple but not necessarily the weakest
primitive condition guaranteeing both Assumptions [U1] and [U2] is that the first four moments
of every element of $x_t$ exist and are bounded uniformly over $G \in \Gamma_0$. This condition allows the
application of the Chebychev inequality to components of the right-hand side of the inequality

$P_G(\delta_T\|Y\| < \eta,\ \|\hat V - V_G\| < \eta) \ge P_G(\delta_T\|Y\| < \eta) + P_G(\|\hat V - V_G\| < \eta) - 1$

to deduce that Assumption [U1] holds. To verify Assumption [U2] we first note that, by Lemma
4 proved in the Appendix, it is sufficient for (5.5) that

$\lim_{T \to \infty} |P_{G_T}(\beta_T' Y < c) - \Phi(c)| = 0$ (5.6)

for all non-stochastic sequences $(G_T, \beta_T)$ satisfying $G_T \in \Gamma_0$ and $\beta_T' V_{G_T} \beta_T = 1$. By the i.i.d.
assumption, $\beta_T' Y$ is $1/\sqrt{T}$ times the sum of $T$ variates $\beta_T'(x_t - E_{G_T}(x_t))$ which are mutually
i.i.d. with mean 0 and variance 1 for each $T$ when $\beta_T' V_{G_T} \beta_T = 1$. This meets the requirements
of the double array version of the classic Lindeberg-Feller central limit theorem thus establishing
asymptotic unit normality of $\beta_T' Y$ hence verifying (5.6).

For the next assumption, recall that $\theta_j$ is the $j$th diagonal element of the matrix $\Delta$. For
notational simplicity, the general dependence of $\theta_j$ and $\Delta$ on $G$ will be kept implicit.

Assumption [U3] : (i) There are finite positive scalars $\lambda$ and $\lambda'$ such that $\lambda' < \theta_j < \lambda$,
($j = 1, 2, ..., p$) uniformly over $G \in \Gamma_0$. (ii) For any finite scalar value $\eta > 0$,

$$\lim_{T\to\infty} \inf_{G\in\Gamma_0} P_G(\|\hat{\Delta} - \Delta\| < \eta\delta_T) = 1.$$

Assumption [U3] holds automatically when $\Delta$ is numerically specified by the user, hence
$\hat{\Delta} = \Delta$. It also allows $\theta_j$ to be $1/\sqrt{v_{jj}}$, where $v_{jj}$ is the $j$th diagonal element of $V_G$, provided that
$v_{jj}$ is bounded below by some constant, say $L > 0$, uniformly over $G \in \Gamma_0$. In such case,

$$|\hat{\theta}_j - \theta_j| \le \frac{|\hat{v}_{jj} - v_{jj}|}{2(L/2)^{3/2}} \tag{5.7}$$

when $|\hat{v}_{jj} - v_{jj}| < L/2$. Hence in the sample mean example described after Assumption [U2],
we can verify [U3]-(ii) by applying the Chebychev inequality to show that $P_G(|\hat{v}_{jj} - v_{jj}| < \eta\delta_T)$
also tends to 1 uniformly over $G \in \Gamma_0$.

Footnote: This simple average framework is used extensively in recent literature on inference for (unconditional) moment
inequality models. See, e.g. Chernozhukov et al. (2007), Romano and Shaikh (2008), Rosen (2008), Andrews and
Guggenberger (2009), Andrews and Soares (2010), Andrews and Barwick (2012) and references cited therein.

For any given positive scalar $\sigma$, let $d_\sigma(\mu)$ denote the $p$ dimensional vector whose $j$th element
equals $\Psi(0)$ when $0 \le \mu_j \le \sigma$ and equals 0 otherwise.

Assumption [U4] : There are finite positive real scalars $\omega$, $\omega'$ and $\sigma$ such that the following
hold uniformly over $G \in \Gamma_0$ : (i) $\|V_G\| < \omega$, (ii) $d_\sigma(\mu)' \Delta V_G \Delta d_\sigma(\mu) > \omega'$ for all non-zero $d_\sigma(\mu)$.

Assumption [U4]-(i) is simply a boundedness assumption which automatically holds when
$V_G$ is a correlation matrix. [U4]-(ii) holds automatically when the smallest eigenvalue of $V_G$ is
bounded away from zero over $G \in \Gamma_0$. Note that [U4]-(ii), essentially strengthening Assumption
[D2], requires that the asymptotic variance of $\sqrt{T} d_\sigma(\mu)' \Delta(\hat{\mu} - \mu)$ be bounded away from zero
for all non-zero $d_\sigma(\mu)$. This is a high level assumption whose verification will be illustrated in
examples of Appendix C.

We can now present the following theorem establishing asymptotic exactness of the test in 
the uniform sense. 

Theorem 2 (Uniform Asymptotic Exactness of Test Size) Given Assumptions [D1] - [D4],
suppose Assumptions [U1] - [U4] also hold. Assume some $G \in \Gamma_0$ has $\mu$ value containing at
least one zero-valued element. Then under Assumptions [A1], [A2], [A3], [A4], [A6] and given
$0 < \alpha < 1/2$,

$$\limsup_{T\to\infty} \sup_{G\in\Gamma_0} P_G(Q < \alpha) = \alpha.$$

6 Asymptotic Power of the Test 

In this section, we study the asymptotic power properties of the test. Proofs of all results are
presented in the Appendix. For notational simplicity, we suppress the dependence of probability
and parameters on the underlying data generating distribution. We first show that the test is 
consistent against fixed alternative hypotheses. 



Footnote: Assumption [U3]-(ii) is stronger than requiring consistency of $\hat{\theta}_j$ as an estimator of $\theta_j$. An alternative
approach is to strengthen Assumption [U2] by taking $Y$ to be $\sqrt{T}(\hat{\Delta}\hat{\mu} - \Delta\mu)$ rather than just $\sqrt{T}(\hat{\mu} - \mu)$. But
that would be implicitly assuming $\sqrt{T}(\hat{\theta}_j - \theta_j)$ is asymptotically normal (or degenerate). Such an assumption is
even stronger than [U3]-(ii) and quite unnecessary for our results.

Footnote: By mean value expansion, $|\hat{\theta}_j - \theta_j| = |\hat{v}_{jj} - v_{jj}| / (2|\tilde{v}_{jj}|^{3/2})$ where $\tilde{v}_{jj}$ lies between $\hat{v}_{jj}$ and $v_{jj}$. Thus when
$|\hat{v}_{jj} - v_{jj}| < L/2$, inequality (5.7) follows by noting that $|\tilde{v}_{jj} - v_{jj}| \le |\hat{v}_{jj} - v_{jj}|$.
Theorem 3 (Consistency) Given [A1] - [A6] with [D1] - [D4], the following is true under
$H_1 : \mu_j < 0$ for some $j \in \{1, 2, ..., p\}$,

$$P(Q < \alpha) \to 1 \text{ as } T \to \infty.$$

Besides consistency, we are also interested in the local behavior of the test. In order to derive
a local power function, we consider a sequence of $\mu$ values in the alternative-hypothesis space
tending at rate $T^{-1/2}$ to a value $\gamma = (\gamma_1, \gamma_2, ..., \gamma_p)'$ on the boundary of the null-hypothesis
space. Specifically, we represent the $j$th element of $\mu$ of such a local sequence as

$$\mu_j = \gamma_j + c_j/\sqrt{T} \tag{6.1}$$

where $\gamma_j \ge 0$ and $c_j$ are constants such that $\gamma_j = 0$ and $c_j < 0$ hold simultaneously for at least
one $j$. The sequence (6.1) is said to be core if $c_j < 0$ holds in every instance of $\gamma_j = 0$. A
core local sequence corresponds to Neyman-Pitman drift in the original sense (McManus (1991))
whereby parameter values conflicting with the null hypothesis are imagined ceteris paribus to
draw ever closer to compliance as $T$ increases. In the easily-visualized case $p = 2$, all points on
the boundary of null-restricted space are limits of core sequences. Non-core sequences can only
converge to the origin, a single point compared to the continuum of the full boundary. We may
now state :

Theorem 4 (Local Power) Assume [A1], [A2], [A3], [A4], [A6] and [D1], [D3], [D4] hold
with the elements $\mu_j$ of $\mu$ taking the $T$-dependent forms as specified by (6.1). Define

$$\tau \equiv \sum_{j=1}^{p} 1\{\gamma_j = 0\}\theta_j c_j,$$

$$\kappa \equiv \sum_{i=1}^{p}\sum_{j=1}^{p} 1\{\gamma_i = 0\}1\{\gamma_j = 0\}\theta_i\theta_j v_{ij},$$

where $v_{ij}$ denotes the $(i,j)$-th element of variance matrix $V$. Assume $\kappa > 0$. Then, as $T \to \infty$,

$$P(Q < \alpha) \to \Phi(z_\alpha - \kappa^{-1/2}\tau), \tag{6.2}$$

where $z_\alpha$ is the $\alpha$ quantile of the standard normal distribution.
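
The limit in (6.2) is easy to evaluate numerically. The following is a minimal sketch, assuming the composite drift is $\tau = \sum_j 1\{\gamma_j = 0\}\theta_j c_j$ and that $\kappa$ is the double sum over binding indices as stated in the theorem; the function name `local_power` and the two-inequality example values are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def local_power(alpha, gamma, c, theta, V):
    """Evaluate Phi(z_alpha - tau / sqrt(kappa)), the limit in (6.2)."""
    # Only inequalities that are binding in the limit (gamma_j = 0) contribute.
    binding = [j for j, g in enumerate(gamma) if g == 0.0]
    tau = sum(theta[j] * c[j] for j in binding)
    kappa = sum(theta[i] * theta[j] * V[i][j] for i in binding for j in binding)
    if kappa <= 0:
        raise ValueError("Theorem 4 requires kappa > 0")
    z_alpha = NormalDist().inv_cdf(alpha)  # alpha quantile, negative for alpha < 1/2
    return NormalDist().cdf(z_alpha - tau / sqrt(kappa))


# Hypothetical p = 2 example: unit variances, correlation 0.5, theta_j = 1.
V = [[1.0, 0.5], [0.5, 1.0]]
theta = [1.0, 1.0]
# Core sequence: the only binding element has c_j < 0, so tau < 0.
core = local_power(0.05, gamma=[0.0, 0.3], c=[-2.0, 1.0], theta=theta, V=V)
# Non-core sequence at the origin: a large positive c_2 outweighs the negative c_1.
non_core = local_power(0.05, gamma=[0.0, 0.0], c=[-1.0, 3.0], theta=theta, V=V)
print(round(core, 3), round(non_core, 3))
```

In this illustration the core sequence yields limiting power above the nominal level, while the non-core sequence, where a positive $c_j$ dominates, yields limiting power below it.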

Theorem 4 implies that the test has power exceeding size against all core sequences because
the composite drift parameter $\tau$ is necessarily negative for such local scenarios. By contrast,
tests based on LFC critical values can be biased against core local sequences tending to boundary
points off the origin. This is easily seen for statistics such as EV and QLR which are continuous
in their arguments. In such cases, local power under any core sequence (6.1) tends to the rejection
probability at the boundary point $\mu = (\gamma_1, \gamma_2, ..., \gamma_p)'$. Unless this point is the LFC itself,
rejection probability there will be smaller than that at any LFC point by definition. Hence the
LFC critical value based test is biased against core local alternatives. A similar argument is
given in Hansen (2003, 2005).

Against non-core local sequences, our test can be biased because a trade-off comes into force
between negative and positive $c_j$ as Theorem 4 shows. Some degree of local bias is common
in multivariate one-sided tests and exists even in GMS procedures using estimated rather than
LFC test critical values, as noted by Andrews and Soares (2010, p. 146, comment (vi)). However,
the exact local direction at which a test exhibits strength or weakness may vary across tests.
Therefore, different tests are complementary rather than competing. To obtain a formal result, we
consider a local sequence converging to the origin, namely $\gamma_j = 0$ for $j \in \{1, 2, ..., p\}$. Let $c$ denote
the vector $(c_1, c_2, ..., c_p)'$. Under such a local scenario, the GMS procedure will asymptotically
treat all inequalities as binding in the critical value calculation. Thus the asymptotic distribution
of the statistic $S(\sqrt{T}\hat{\mu}, \hat{V})$ of Subsection 2.1 is the same as that of $S(Z + c, V)$ and the test
rejection probability tends to

$$P(S(Z + c, V) > q_\alpha) \tag{6.3}$$

where $q_\alpha$ is the $(1 - \alpha)$ quantile of $S(Z, V)$ under $Z \sim N(0, V)$. We now present a theorem showing
that the test of this paper is locally most powerful for a non-empty subclass of directions. Let $\theta$
denote the vector of diagonal elements of the matrix $\Delta$.

Theorem 5 Suppose the variance matrix $V$ is positive definite and $\gamma_j = 0$ for $j \in \{1, 2, ..., p\}$ in
the local sequence (6.1). Then for every testing function $S(.,.)$ such that $P(S(Z, V) > q_\alpha) = \alpha$
under $Z \sim N(0, V)$, the asymptotic local power in (6.2) is at least $\alpha$ and is not smaller than
(6.3) when $c = -\delta V\theta$ for any positive scalar $\delta$.

Depending on the off-diagonal elements of $V$, the local directions $-\delta V\theta$ can be for either core
or non-core sequences. Theorem 5 implies that along such local alternatives, the present test is
not biased and its limiting local power is not dominated by those of existing tests based on GMS
critical values. Note that the result of Theorem 5 does not require specification of particular
functional forms of $S(.,.)$. It is achieved by indirectly exploiting the Neyman-Pearson lemma.
Some special forms are used in Section 7 for numerical illustration.
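
The role of the direction $c = -\delta V\theta$ can be seen from a short calculation. This is an illustrative sketch rather than the formal proof in the Appendix, and it assumes the reading of Theorem 4 in which $\tau = \theta' c$ and $\kappa = \theta' V \theta$ when every $\gamma_j = 0$. Substituting $c = -\delta V\theta$ into (6.2) gives

$$\tau = -\delta\,\theta' V \theta, \qquad \kappa = \theta' V \theta, \qquad \Phi(z_\alpha - \kappa^{-1/2}\tau) = \Phi\big(z_\alpha + \delta\sqrt{\theta' V \theta}\big).$$

For comparison, the most powerful level-$\alpha$ test of $Z \sim N(0, V)$ against $Z \sim N(c, V)$ has power $\Phi(z_\alpha + \sqrt{c' V^{-1} c})$, and for this direction

$$c' V^{-1} c = \delta^2\,\theta' V V^{-1} V \theta = \delta^2\,\theta' V \theta,$$

so the two expressions coincide. No test with limiting rejection probability of the form (6.3) can exceed this bound.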

7 Monte Carlo Simulation Studies 

In this section we conduct a series of Monte Carlo simulations to study the finite sample per- 
formance of the test. All tables of simulation results are placed together at the end of the 
section. 

Footnote: Note that the vector $-\delta V\theta$ necessarily contains at least one negative element since $V$ is positive definite, $\theta$ is
a positive vector and $\delta$ is a positive scalar.
7.1 The Specification of Smoothed Indicator

Our objective is to investigate how well the asymptotic theory of the test works in finite sample
simulations. For this purpose, we choose $\Psi$ functions which are simple, recognized and not
contrived. It would be premature at this stage to undertake a more elaborate exercise to find an
optimal combination of $\Psi(x)$ and $K(T)$.

For the specification of $\Psi$, the following functions are heuristic choices that are widely adopted
in research on smoothed threshold crossing models.

Normal : $\Psi_{Nor}(x) = 1 - \Phi(x)$
Logistic : $\Psi_{Log}(x) = (1 + \exp(x))^{-1}$

Besides $\Psi_{Nor}$ and $\Psi_{Log}$, the following simple choice of $\Psi$ mentioned in Section 2.2 is also valid.

Step-at-unity : $\Psi_{Step}(x) = 1\{x \le 1\}$

As regards the choice of $K(T)$, the following two specifications closely match tuning parameters
used in recent literature on inference of moment inequality models (see e.g. Chernozhukov et
al. (2007) and Andrews and Soares (2010)). These choices are

SIC : $K_{SIC}(T) = \sqrt{T/\log(T)}$

LIL : $K_{LIL}(T) = \sqrt{T/(2\log\log(T))}$

The first name reflects a connection with the Schwarz Information Criterion (SIC) for model
selection and the second with the Law of the Iterated Logarithm (LIL).
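
The three smoothers and two tuners are one-line functions. The sketch below uses only the Python standard library; the function names are hypothetical, the step function is written with a weak inequality at 1, and `smoothed_indicator` assumes that $\Psi_T(x)$ means $\Psi(K(T)x)$, which is how the smoothed indicator is used in the proofs of the Appendix.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def psi_normal(x):
    """Normal smoother: 1 - Phi(x)."""
    return 1.0 - NormalDist().cdf(x)


def psi_logistic(x):
    """Logistic smoother: 1 / (1 + exp(x)), written to avoid overflow."""
    if x >= 0:
        e = exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + exp(x))


def psi_step(x):
    """Step-at-unity smoother: indicator of x <= 1."""
    return 1.0 if x <= 1.0 else 0.0


def k_sic(T):
    """SIC-type tuner: sqrt(T / log T)."""
    return sqrt(T / log(T))


def k_lil(T):
    """LIL-type tuner: sqrt(T / (2 log log T))."""
    return sqrt(T / (2.0 * log(log(T))))


def smoothed_indicator(psi, K, T, x):
    """Psi_T(x), assumed here to be Psi(K(T) * x)."""
    return psi(K(T) * x)


print(k_sic(250), k_lil(250))
print(smoothed_indicator(psi_logistic, k_sic, 250, 0.1))
```

At $T = 250$ the LIL tuner is the larger of the two, so it pushes the smoothed indicator toward the raw indicator more aggressively than the SIC tuner does.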

7.2 The Simulation Setup 

The simulation experiments are designed as follows. We choose a nominal test size of $\alpha = 0.05$.
We use $R = 10000$ replications for simulated rejection probabilities. In each replication, we
generate i.i.d. observations $\{x_t\}_{t=1}^{T}$ with $T = 250$ according to the following scheme :

$$x_t = \mu + V^{1/2} w_t \tag{7.1}$$

where $w_t$ is a $p$ dimensional random vector whose elements are i.i.d. from distribution $G_w$.
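
A data generator for scheme (7.1) might look as follows. This is a sketch under stated assumptions: the lower-triangular Cholesky factor stands in for $V^{1/2}$ (the text does not say which square root is used, and any factor with $LL' = V$ gives the same covariance), each $G_w$ is standardized to mean 0 and variance 1, and the names `draw_w`, `cholesky` and `simulate` are hypothetical.

```python
import random
from math import log, pi, sqrt


def draw_w(dist, rng):
    """One draw with mean 0 and variance 1 from the named distribution."""
    if dist == "normal":
        return rng.gauss(0.0, 1.0)
    if dist == "logistic":
        u = rng.random()
        while u == 0.0:  # avoid log(0)
            u = rng.random()
        # Standard logistic has variance pi^2 / 3, so rescale to unit variance.
        return log(u / (1.0 - u)) * sqrt(3.0) / pi
    if dist == "uniform":
        # U(-1, 2) has mean 0.5 and variance 9/12 = 0.75.
        return (rng.uniform(-1.0, 2.0) - 0.5) / sqrt(0.75)
    raise ValueError("unknown distribution: " + dist)


def cholesky(V):
    """Lower-triangular L with L L' = V, for positive definite V."""
    p = len(V)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = V[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = sqrt(s) if i == j else s / L[j][j]
    return L


def simulate(mu, V, T, dist, seed=0):
    """Generate T observations x_t = mu + L w_t."""
    rng = random.Random(seed)
    L = cholesky(V)
    p = len(mu)
    data = []
    for _ in range(T):
        w = [draw_w(dist, rng) for _ in range(p)]
        data.append([mu[i] + sum(L[i][k] * w[k] for k in range(i + 1)) for i in range(p)])
    return data


x = simulate([0.0, 0.5], [[1.0, 0.5], [0.5, 1.0]], T=250, dist="logistic", seed=1)
print(len(x), len(x[0]))
```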

We compute $\hat{\mu}$ and $\hat{V}$ as the sample average and sample variance of the generated data. We
take the scalars $\theta_j = 1/\sqrt{v_{jj}}$ and $\hat{\theta}_j = 1/\sqrt{\hat{v}_{jj}}$ where $v_{jj}$ and $\hat{v}_{jj}$ are the $j$th diagonal elements
of $V$ and $\hat{V}$ respectively. This simple simulation setup is also adopted by Andrews and Soares
(2010) and Andrews and Barwick (2012) in simulation study of the GMS tests. For $G_w$, we
consider three distributions: standard normal, logistic and $U(-1, 2)$, the uniform distribution on
the interval $[-1, 2]$. All of these distributions are centered and scaled such that $E(w_{t,j}) = 0$ and
$Var(w_{t,j}) = 1$ for $j \in \{1, 2, ..., p\}$. Standard normality of $G_w$ is the benchmark. The logistic
distribution has thicker tails than the normal whilst the support of a uniform distributed random
variate is bounded. The latter two distributions are included to assess the test performance under
finite sample non-normality of $\hat{\mu}$. For comparison, we also conduct simulations using the following
test statistics:

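$$S_1 = -\min\{\sqrt{T}\hat{\theta}_1\hat{\mu}_1, \sqrt{T}\hat{\theta}_2\hat{\mu}_2, ..., \sqrt{T}\hat{\theta}_p\hat{\mu}_p, 0\},$$

$$S_2 = \min_{\mu:\mu \ge 0} T(\hat{\mu} - \mu)'\hat{V}^{-1}(\hat{\mu} - \mu),$$

$$S_3 = \sum_{j=1}^{p} (\min\{\sqrt{T}\hat{\theta}_j\hat{\mu}_j, 0\})^2,$$

$$S_4 = \sum_{j=1}^{p} [-\sqrt{T}\min(\hat{\theta}_j\hat{\mu}_j, 0)].$$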

The extreme value form $S_1$ is essentially Hansen's (2005) test statistic appropriated for testing
multiple non-negativity hypotheses. $S_2$ is the classic QLR test statistic. $S_3$ is the modified-
method-of-moments (MMM) statistic considered in the literature of moment inequality models
(see, e.g. Chernozhukov et al. (2007), Romano and Shaikh (2008), Andrews and Guggenberger
(2009) and Andrews and Soares (2010)). $S_4$ is the raw sum-of-negative-part statistic which can
be transformed by smoothing into the key component of the test of the present paper.
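
Three of the four comparison statistics depend on the data only through the studentized means $\sqrt{T}\hat{\theta}_j\hat{\mu}_j$ and can be computed directly. The sketch below covers $S_1$, $S_3$ and $S_4$; $S_2$ is left out because it requires solving a quadratic program over $\mu \ge 0$. The function name `negative_part_statistics` and the input values are hypothetical.

```python
from math import sqrt


def negative_part_statistics(mu_hat, theta_hat, T):
    """Return (S1, S3, S4) computed from sqrt(T) * theta_hat_j * mu_hat_j."""
    t = [sqrt(T) * th * m for th, m in zip(theta_hat, mu_hat)]
    s1 = -min(t + [0.0])                     # extreme value form
    s3 = sum(min(tj, 0.0) ** 2 for tj in t)  # modified method of moments
    s4 = sum(-min(tj, 0.0) for tj in t)      # raw sum of negative parts
    return s1, s3, s4


# Hypothetical estimates with two negative sample means out of three.
s1, s3, s4 = negative_part_statistics(
    mu_hat=[0.1, -0.02, -0.05], theta_hat=[1.0, 1.0, 2.0], T=100
)
print(s1, s3, s4)
```

All three statistics are zero when every $\hat{\mu}_j$ is non-negative, and they differ only in how the negative parts are aggregated (maximum, sum of squares, or plain sum).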

The critical values for tests based on $S_1$ to $S_4$ are estimated using bootstrap coupled with the
GMS procedure of the elementwise t-test type as suggested by Andrews and Soares (2010) and
Andrews and Barwick (2012). We use 10000 bootstrap repetitions for calculation of the GMS
test critical values. The tuning parameter in the GMS procedure is set to be the SIC or LIL type
(Andrews and Soares (2010, p. 131)). For ease of reference, let $S_j(SIC)$ and $S_j(LIL)$ denote the
GMS test using statistic $S_j$ with tuning SIC and LIL respectively. Furthermore, let $Q(\Psi, K)$
denote the present test implemented with its smoothed indicator specified by $\Psi$ and $K$.

We consider simulation scenarios based on $p \in \{4, 6, 10\}$. For multivariate simulation design,
we have to be more selective on the specifications of $\mu$ and $V$ parameters of (7.1). Concerning
the $\mu$ vector, we follow a design similar to that previously employed by Hansen (2005, p. 373)
in simulation study of the test size performance. To be specific, $\mu$ is the $p$ dimensional vector
given by

$$\mu_1 = 0, \qquad \mu_j = \lambda(j - 1)/(p - 1) \text{ for } p \ge j \ge 2$$

where $\lambda \in \{0, 0.25, 0.5\}$. Note that the $\lambda$ values are introduced to control the extent to which
inequalities satisfying the null hypothesis are in fact non-binding. Regarding the variance matrix
$V$, we set $V$ to be a Toeplitz matrix with elements $V_{ij} = \rho^{j-i}$ for $j \ge i$, where $\rho \in \{0, -0.5, 0.5\}$.
This greatly simplifies the specification for off-diagonal elements of $V$ but still allows for presence
of various degrees of both positive and negative correlations.
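
The size design can be written down in a few lines. In the sketch below the mean vector follows the formula given above, while the Toeplitz entries are taken to be $\rho^{|i-j|}$; that functional form is an assumption made for illustration, since the expression is not fully legible in the source. The function names are hypothetical.

```python
def mu_size_design(p, lam):
    """mu_1 = 0 and mu_j = lam * (j - 1) / (p - 1) for j = 2, ..., p."""
    # The zero-based index j below corresponds to j - 1 in the formula.
    return [lam * j / (p - 1) for j in range(p)]


def toeplitz_variance(p, rho):
    """Toeplitz matrix with assumed entries rho ** |i - j| (unit diagonal)."""
    return [[rho ** abs(i - j) for j in range(p)] for i in range(p)]


mu = mu_size_design(4, 0.5)
V = toeplitz_variance(4, -0.5)
print(mu)
for row in V:
    print(row)
```

With $\lambda = 0$ every inequality is binding, and larger $\lambda$ moves all but the first element of $\mu$ into the interior of the null region.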

For power studies, we consider the $\mu$ vector given by

$$\mu = -\delta V\theta + \varepsilon\bar{\mu} \tag{7.2}$$

where $\delta \in \{0.15, 0.1, 0.05\}$, $V$ is the variance matrix given as above, $\theta = (\theta_1, \theta_2, ..., \theta_p)'$, $\varepsilon \in
\{0, 0.5, 0.8\}$ and $\bar{\mu}$ is the vector with $\bar{\mu}_j = \delta$ for $1 \le j \le p/2$ and $\bar{\mu}_j = -\delta$ for $p/2 < j \le p$.
For $\varepsilon = 0$, the design (7.2) mimics the local direction as suggested by Theorem 5 under which
the test $Q(\Psi, K)$ is expected to outperform other tests. When $\varepsilon$ is non-zero, the local direction
in favor of the present test is perturbed with another vector $\bar{\mu}$ containing a mixture of positive
and negative elements. Such $\bar{\mu}$ may incur power trade-off in light of Theorem 4 and thus the
perturbation parameter $\varepsilon$ controls the degree of deviation toward $\bar{\mu}$ and enables some sensitivity
check of test power performance.
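
The power design (7.2) combines the favorable direction $-\delta V\theta$ with the perturbation vector. A minimal sketch, assuming $\theta_j = 1/\sqrt{v_{jj}}$ as in the simulation setup and writing the perturbation vector as `mu_bar` (a hypothetical name, as is `mu_power_design`):

```python
from math import sqrt


def mu_power_design(V, delta, eps):
    """Return mu = -delta * V theta + eps * mu_bar as in (7.2)."""
    p = len(V)
    theta = [1.0 / sqrt(V[j][j]) for j in range(p)]
    v_theta = [sum(V[i][j] * theta[j] for j in range(p)) for i in range(p)]
    # mu_bar_j = delta for the first half of the indices, -delta for the rest.
    mu_bar = [delta if (j + 1) <= p / 2 else -delta for j in range(p)]
    return [-delta * v_theta[i] + eps * mu_bar[i] for i in range(p)]


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
unperturbed = mu_power_design(identity, delta=0.1, eps=0.0)
perturbed = mu_power_design(identity, delta=0.1, eps=0.5)
print(unperturbed)
print(perturbed)
```

In this uncorrelated example the perturbation pulls the first half of the elements toward zero and pushes the second half further into the alternative, which is the trade-off the parameter $\varepsilon$ is meant to probe.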

7.3 Simulation results 

We report the simulated maximum null rejection probability (MNRP) and average power (AP)
for each test. Given $G_w$, the maximization for the MNRP is over all $H_0$ compatible combinations
of $\mu$ and $\rho$ values whilst given both $G_w$ and $\varepsilon$, the averaging for AP is over all $H_1$ compatible $\mu$
and $\rho$ configurations. Table 1 lists the MNRP values in three block columns side by side for the
three specifications of $G_w$. The AP values generated by three $\varepsilon$ values are then listed separately
for each $G_w$ in Tables 2, 3 and 4.

In Table 1, the primary interest is how close the MNRP values are to the nominal 5% significance
level, particularly in cases of over-rejecting. In that respect, we compare the percentage
of values not exceeding 0.05, 0.055, 0.06, 0.065. These percentages are about 18, 51, 87, 96 for
the 54 $Q(\Psi, K)$ values and 9, 52, 79, 94 for the 72 values of the GMS tests. Plainly, the $Q(\Psi, K)$
test is no more prone to over-rejection than the GMS tests. A common feature across all tests
is that over-rejection tends to increase with $p$. However, only 2 out of 54 $Q(\Psi, K)$ entries and
4 out of 72 GMS entries exceed 0.065. These excesses amount to less than 5% of a table of 126
simulated entries.
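
The threshold percentages for the $Q(\Psi, K)$ tests can be recomputed from the 54 corresponding entries of Table 1. The short script below does this arithmetic; the variable names are hypothetical and the values are transcribed from the six $Q(\Psi, K)$ rows of the table.

```python
# Table 1 entries for the six Q(Psi, K) tests (rows) across the nine designs (columns).
q_mnrp = [
    [.049, .056, .055, .052, .054, .056, .051, .052, .055],  # Step, SIC
    [.046, .053, .055, .046, .054, .057, .048, .052, .058],  # Log, SIC
    [.050, .059, .061, .050, .058, .063, .050, .056, .063],  # Nor, SIC
    [.051, .059, .059, .053, .056, .059, .051, .053, .057],  # Step, LIL
    [.049, .056, .057, .048, .057, .060, .048, .053, .059],  # Log, LIL
    [.054, .062, .065, .052, .059, .066, .053, .058, .066],  # Nor, LIL
]
values = [v for row in q_mnrp for v in row]
thresholds = [0.050, 0.055, 0.060, 0.065]
counts = [sum(v <= t for v in values) for t in thresholds]
shares = [100.0 * c / len(values) for c in counts]
print(counts)
print([round(s, 1) for s in shares])
```

The resulting shares agree with the figures quoted in the text up to rounding, and the count above 0.065 is the 2 out of 54 entries mentioned there.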

We now examine the sensitivity of MNRP to the underlying data generating distribution $G_w$.
For all tests, Table 1 exhibits little systematic difference attributable to the three different
specifications of $G_w$. These figures suggest that the MNRP results are not sensitive to finite sample
non-normality. Furthermore, for each test, regardless of $G_w$, Table 1 suggests that use of the SIC
type tuner in place of the LIL can yield better control of test size. This finding is consistent with
the simulation studies of Andrews and Soares (2010, pp. 149-152) demonstrating that the SIC
tuner tends to give better MNRP properties. Overall, $Q(\Psi_{Step}, K_{SIC})$ and $Q(\Psi_{Log}, K_{SIC})$ have
better MNRP results among the class of $Q(\Psi, K)$ tests and their size performance is comparable
to that of the four SIC tuned GMS tests.

We now turn to Tables 2, 3, 4 giving AP results of the tests. For the unperturbed direction
($\varepsilon = 0$), Theorem 5 of Section 6 indicates that the $Q(\Psi, K)$ test is locally more powerful than
the GMS tests considered in the simulations. Along such local direction, irrespective of the
underlying $G_w$, the simulation results indicate that the $Q(\Psi, K)$ tests dominate the GMS tests
in AP performance. The GMS QLR test ($S_2$) is not far behind. Hansen's test ($S_1$), which is
arguably the most stable in terms of MNRP performance, has distinctly lower power. But it is
still a good performer. For the perturbed directions ($\varepsilon \in \{0.5, 0.8\}$), while the $Q(\Psi, K)$ tests still
outperform the $S_1$ tests, they do not generally dominate other versions of the GMS tests but the
AP differences are not large.

We comment on the comparative performance of the $Q(\Psi, K)$ tests with the $S_4$ tests. Their
comparison is of particular interest since the present test essentially attempts to smooth the
statistic $S_4$. The smoothed version is less costly in computation because its critical value is
obtained without resampling. We compare $S_4(SIC)$ with $Q(\Psi_{Step}, K_{SIC})$ and $Q(\Psi_{Log}, K_{SIC})$.
The simulation results suggest that the $Q(\Psi_{Step}, K_{SIC})$ and $Q(\Psi_{Log}, K_{SIC})$ tests have similar
degree of size control as $S_4(SIC)$. Against the alternative hypothesis, $Q(\Psi_{Log}, K_{SIC})$ has slightly
larger power than $S_4(SIC)$ in all 27 cases while $Q(\Psi_{Step}, K_{SIC})$ outperforms $S_4(SIC)$ in 18 out
of the 27 cases. These findings suggest that the implementational advantage of the present test based
on smoothing does not appear to be achieved at the cost of test performance.

Perusing all the other entries in Tables 2, 3, 4, it seems that the different variants of the
$Q(\Psi, K)$ test perform quite similarly to one another, retaining power well in excess of 0.73
throughout. What these results illustrate is that the $Q(\Psi, K)$ test has identifiable directions
of strength as indicated theoretically by this paper. Given the simulation results above, the
$Q(\Psi_{Step}, K_{SIC})$ and $Q(\Psi_{Log}, K_{SIC})$ tests work at least as well as other $Q(\Psi, K)$ versions examined
here but have better size performance. Hence while $K_{SIC}$ is the preferred tuner, both $\Psi_{Step}$
and $\Psi_{Log}$ are the recommended smoothers.

Table 1 : Simulated Maximum Null Rejection Probability for T = 250

DGP $G_w$                     N(0,1)               Logistic             U(-1,2)
Number of inequalities        4     6     10       4     6     10       4     6     10
$Q(\Psi_{Step}, K_{SIC})$     .049  .056  .055     .052  .054  .056     .051  .052  .055
$Q(\Psi_{Log}, K_{SIC})$      .046  .053  .055     .046  .054  .057     .048  .052  .058
$Q(\Psi_{Nor}, K_{SIC})$      .050  .059  .061     .050  .058  .063     .050  .056  .063
$Q(\Psi_{Step}, K_{LIL})$     .051  .059  .059     .053  .056  .059     .051  .053  .057
$Q(\Psi_{Log}, K_{LIL})$      .049  .056  .057     .048  .057  .060     .048  .053  .059
$Q(\Psi_{Nor}, K_{LIL})$      .054  .062  .065     .052  .059  .066     .053  .058  .066
$S_1(SIC)$                    .050  .052  .054     .049  .052  .053     .051  .052  .053
$S_2(SIC)$                    .050  .054  .053     .052  .055  .054     .050  .050  .054
$S_3(SIC)$                    .050  .056  .052     .050  .051  .057     .052  .052  .056
$S_4(SIC)$                    .051  .058  .054     .053  .054  .057     .052  .055  .058
$S_1(LIL)$                    .053  .055  .055     .051  .054  .056     .054  .054  .056
$S_2(LIL)$                    .058  .061  .061     .059  .063  .063     .058  .058  .061
$S_3(LIL)$                    .056  .061  .057     .055  .058  .065     .058  .058  .064
$S_4(LIL)$                    .059  .068  .066     .060  .064  .070     .061  .065  .070

Table 2 : Simulated Average Power for T = 250, $G_w$ = N(0,1)

                              $\varepsilon = 0$    $\varepsilon = 0.5$  $\varepsilon = 0.8$
Number of inequalities        4     6     10       4     6     10       4     6     10
$Q(\Psi_{Step}, K_{SIC})$     .770  .837  .900     .773  .840  .904     .783  .849  .909
$Q(\Psi_{Log}, K_{SIC})$      .754  .827  .893     .783  .849  .910     .813  .872  .927
$Q(\Psi_{Nor}, K_{SIC})$      .741  .814  .882     .780  .845  .906     .817  .875  .928
$Q(\Psi_{Step}, K_{LIL})$     .752  .822  .886     .761  .830  .895     .780  .847  .906
$Q(\Psi_{Log}, K_{LIL})$      .748  .821  .888     .781  .847  .908     .815  .874  .928
$Q(\Psi_{Nor}, K_{LIL})$      .734  .807  .875     .778  .844  .903     .819  .876  .928
$S_1(SIC)$                    .593  .626  .650     .699  .728  .761     .774  .803  .831
$S_2(SIC)$                    .714  .781  .847     .784  .844  .901     .834  .887  .937
$S_3(SIC)$                    .678  .735  .793     .750  .804  .858     .805  .854  .899
$S_4(SIC)$                    .730  .794  .855     .767  .830  .886     .808  .864  .913
$S_1(LIL)$                    .594  .626  .650     .700  .729  .762     .776  .805  .832
$S_2(LIL)$                    .716  .782  .848     .785  .846  .903     .836  .889  .939
$S_3(LIL)$                    .678  .736  .794     .751  .805  .860     .808  .856  .902
$S_4(LIL)$                    .732  .795  .857     .769  .833  .889     .811  .868  .916

Table 3 : Simulated Average Power for T = 250, $G_w$ = Logistic

                              $\varepsilon = 0$    $\varepsilon = 0.5$  $\varepsilon = 0.8$
Number of inequalities        4     6     10       4     6     10       4     6     10
$Q(\Psi_{Step}, K_{SIC})$     .772  .839  .900     .774  .841  .903     .781  .850  .910
$Q(\Psi_{Log}, K_{SIC})$      .757  .828  .893     .785  .851  .910     .813  .875  .929
$Q(\Psi_{Nor}, K_{SIC})$      .744  .815  .882     .781  .847  .906     .817  .878  .930
$Q(\Psi_{Step}, K_{LIL})$     .753  .824  .886     .763  .831  .894     .779  .848  .908
$Q(\Psi_{Log}, K_{LIL})$      .751  .823  .888     .783  .849  .908     .815  .876  .930
$Q(\Psi_{Nor}, K_{LIL})$      .738  .808  .874     .780  .845  .904     .819  .878  .930
$S_1(SIC)$                    .599  .629  .651     .697  .729  .762     .775  .803  .831
$S_2(SIC)$                    .718  .782  .847     .784  .845  .901     .834  .889  .938
$S_3(SIC)$                    .681  .737  .794     .750  .803  .858     .806  .855  .901
$S_4(SIC)$                    .734  .795  .854     .768  .830  .886     .807  .866  .915
$S_1(LIL)$                    .600  .629  .651     .699  .730  .763     .777  .805  .833
$S_2(LIL)$                    .719  .784  .849     .786  .846  .903     .837  .891  .940
$S_3(LIL)$                    .682  .738  .796     .751  .805  .861     .808  .857  .903
$S_4(LIL)$                    .735  .797  .856     .771  .833  .889     .811  .869  .919

Table 4 : Simulated Average Power for T = 250, $G_w$ = U(-1,2)

                              $\varepsilon = 0$    $\varepsilon = 0.5$  $\varepsilon = 0.8$
Number of inequalities        4     6     10       4     6     10       4     6     10
$Q(\Psi_{Step}, K_{SIC})$     .769  .837  .899     .775  .842  .902     .782  .849  .908
$Q(\Psi_{Log}, K_{SIC})$      .754  .826  .892     .785  .850  .910     .812  .874  .926
$Q(\Psi_{Nor}, K_{SIC})$      .741  .813  .880     .781  .846  .906     .817  .876  .927
$Q(\Psi_{Step}, K_{LIL})$     .752  .821  .885     .763  .832  .894     .779  .847  .907
$Q(\Psi_{Log}, K_{LIL})$      .749  .820  .886     .784  .848  .908     .815  .876  .927
$Q(\Psi_{Nor}, K_{LIL})$      .735  .806  .873     .780  .844  .903     .819  .878  .928
$S_1(SIC)$                    .594  .623  .652     .698  .727  .758     .773  .801  .830
$S_2(SIC)$                    .715  .778  .846     .784  .843  .900     .834  .887  .937
$S_3(SIC)$                    .678  .733  .793     .749  .803  .858     .805  .854  .899
$S_4(SIC)$                    .730  .793  .852     .768  .831  .886     .807  .866  .914
$S_1(LIL)$                    .594  .623  .652     .699  .728  .759     .775  .803  .831
$S_2(LIL)$                    .716  .780  .848     .785  .845  .902     .836  .889  .939
$S_3(LIL)$                    .679  .734  .794     .751  .805  .860     .807  .857  .901
$S_4(LIL)$                    .731  .794  .853     .770  .833  .889     .811  .869  .918

8 Conclusions

This paper develops a test of multiple inequality hypotheses whose implementation does not
require computationally intensive procedures. The test is based on origin-smooth approximation of
indicators underlying the sum-of-negative-part statistic. This yields a simply structured statistic
whose asymptotic distribution, whenever non-degenerate, is normal under the null hypothesis.
Hence test critical values can be fixed ex ante and are essentially based on the unit normal
distribution. Moreover, the test is applicable under weak assumptions allowing for estimator
covariance singularity.

We have proved that the size of the test is asymptotically exact in the uniform sense. The
test is consistent against all fixed alternative hypotheses. We have derived a local power function
and used it to demonstrate that the test is unbiased against a wide class of local alternatives. 
We have also provided a new theoretical result pinpointing directions of alternatives for which 
the test is locally most powerful. 

We have performed simulations which illustrate the potential of the test to be of practical 
inferential value along with simplicity and speed. These simulations, carried out for a range of 
p values, also shed light on the choice of smoothed indicator. They suggest that when coupled 
with the SIC type tuner, both the logistic and the step-at-unity smoothers perform well in finite 
samples. These are the recommended choices for test implementation. The simulation study 
also compares the test of this paper with several different tests which estimate critical values 
using the GMS procedure. We find that the test appears to be a viable complement to the GMS 
critical value estimation methodology. 
A Supplementary Derivation of $\Lambda_T(\hat{\mu}_j, \hat{v}_{jj})$

The term $\Lambda_T(\hat{\mu}_j, \hat{v}_{jj})$ acts as an approximation for the expectation of $[\Psi_T(\hat{\mu}_j) - \Psi(0)]\sqrt{T}\hat{\mu}_j$
evaluated at $\mu_j = 0$. Under regularity condition [D1], when $\mu_j = 0$, the distribution of $\sqrt{T}\hat{\mu}_j$ for
$T$ sufficiently large is approximately normal with mean zero and variance $v_{jj}$. Let $X$ denote any
scalar random variable distributed as $N(0, c)$. Define $h_T \equiv K(T)/\sqrt{T}$. Given the definition of $\Psi_T$, $\Lambda_T(\hat{\mu}_j, \hat{v}_{jj})$
is thus constructed to approximate $E((\Psi(h_T X) - \Psi(0))X) = E(\Psi(h_T X)X)$ with $c = v_{jj}$. In
what follows, we take as read the notation and definitions stated between equations (3.1) and
(3.2).

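Define $a_0 \equiv -\infty$ and $a_{n+1} \equiv \infty$. Let $\phi$ denote the standard normal density function. Note
that

$$E(\Psi(h_T X)X) = \sum_{i=0}^{n} \int_{a_i/h_T}^{a_{i+1}/h_T} \Psi(h_T x)\,x\,\phi(x/\sqrt{c})/\sqrt{c}\;dx$$

$$= \sqrt{c}\Big[\sum_{i=0}^{n} \int_{a_i/h_T}^{a_{i+1}/h_T} h_T\Psi'(h_T x)\phi(x/\sqrt{c})\,dx - \sum_{i=1}^{n}\big(\Psi(a_i^-) - \Psi(a_i^+)\big)\phi\big(a_i/(h_T\sqrt{c})\big)\Big] \tag{A.1}$$

$$= c\,h_T E(\psi(h_T X)) - \sqrt{c}\sum_{i=1}^{n}\big(\Psi(a_i^-) - \Psi(a_i^+)\big)\phi\big(a_i/(h_T\sqrt{c})\big) \tag{A.2}$$

where (A.1) follows from integration by parts and re-arrangement of terms in the sum and (A.2)
follows by using [A2] which implies $\Psi'(x) = \psi(x)$ almost everywhere. Taking $c = v_{jj}$ and plugging
in the parameter estimates, we hence construct $\Lambda_T(\hat{\mu}_j, \hat{v}_{jj})$ as

$$\Lambda_T(\hat{\mu}_j, \hat{v}_{jj}) \equiv \hat{v}_{jj}\,\psi(K(T)\hat{\mu}_j)K(T)/\sqrt{T} - \sqrt{\hat{v}_{jj}}\sum_{i=1}^{n}\big(\Psi(a_i^-) - \Psi(a_i^+)\big)\phi\Big(\frac{a_i\sqrt{T}}{K(T)\sqrt{\hat{v}_{jj}}}\Big). \tag{A.3}$$
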

We now comment on the derivative term in the expression (A.3). Since $h_T$ goes to zero as
$T$ increases, $E(\psi(h_T X))$ tends to $\psi(0)$ by Assumption [A2] and the Dominated Convergence
Theorem. The limit value $\psi(0)$ also coincides with the probability limit of $\psi(K(T)\hat{\mu}_j)$ for the
case $\mu_j = 0$. Hence, we use $\psi(K(T)\hat{\mu}_j)$ instead of $E(\psi(h_T X))$ to account for the slope effect,
thus allowing the derivative term to depend on the estimate $\hat{\mu}_j$. This has the advantage that for
non-zero valued $\mu_j$, $\psi(K(T)\hat{\mu}_j)$ itself also tends to zero and hence yields faster convergence of $\Lambda_T$
to zero when the function $\Psi$ further has the properties $\lim_{x\to-\infty}\psi(x) = \lim_{x\to\infty}\psi(x) = 0$.
Specifications of $\Psi$ satisfying these properties are numerous, including the logistic and the normal
smoothers given in Section 7.1.
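
The identity behind the adjustment term can be checked numerically. The sketch below assumes the reading of the derivation in which $E(\Psi(h_T X)X) = c\,h_T E(\psi(h_T X))$ minus a jump correction; it verifies this by trapezoid integration for the logistic smoother (no jumps) and for the step-at-unity smoother (zero derivative, one unit jump at $a_1 = 1$). The function names, the value $c = 1.5$ and the integration grid are hypothetical choices.

```python
from math import exp, log, pi, sqrt


def phi(z):
    """Standard normal density."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def psi_cap_logistic(x):
    """Logistic smoother Psi(x) = 1 / (1 + exp(x)); arguments here are moderate."""
    return 1.0 / (1.0 + exp(x))


def psi_logistic_deriv(x):
    """Derivative of the logistic smoother, -exp(x) / (1 + exp(x))^2."""
    e = exp(-abs(x))
    return -e / (1.0 + e) ** 2


def expect(f, c, n=20001, width=10.0):
    """Trapezoid approximation of E f(X) for X ~ N(0, c)."""
    s = sqrt(c)
    a, b = -width * s, width * s
    step = (b - a) / (n - 1)
    total = 0.0
    for i in range(n):
        x = a + i * step
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * f(x) * phi(x / s) / s
    return total * step


T, c = 250, 1.5
h = sqrt(T / log(T)) / sqrt(T)  # h_T = K_SIC(T) / sqrt(T)

# Logistic smoother: only the derivative term is present.
lhs = expect(lambda x: psi_cap_logistic(h * x) * x, c)
rhs = c * h * expect(lambda x: psi_logistic_deriv(h * x), c)

# Step-at-unity smoother: only the jump term is present.
lhs_step = expect(lambda x: x if h * x <= 1.0 else 0.0, c)
rhs_step = -sqrt(c) * phi(1.0 / (h * sqrt(c)))

print(lhs, rhs)
print(lhs_step, rhs_step)
```

The two sides agree closely in both cases; the step case is less accurate only because the trapezoid rule handles the discontinuity crudely.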



Footnote: By taking $X \sim N(0, c)$ with $c = \hat{v}_{jj}$, $E(\psi(h_T X))$ can be computed using numerical integral as

$$\int_{-\infty}^{\infty} \psi(h_T y)\,\phi(y/\sqrt{c})/\sqrt{c}\;dy.$$
B Proofs of Theoretical Results

This section presents proofs of all theoretical results stated in the paper. Proofs of Theorems 1,
3, 4 and 5 (pointwise asymptotics and local power) along with preliminary Lemmas 1, 2 and 3
are presented in Subsections B.1 - B.7. Proofs of Lemma 4 providing a sufficient condition for
Assumption [U2] and Theorem 2 (uniform asymptotics) are given separately in Subsections B.8
and B.9 of the Appendix.

Recall that $J$ denotes the set $\{1, 2, ..., p\}$ and the sets $A$, $M$, and $B$ are defined as

$$A = \{j \in J : \mu_j > 0\}, \quad M = \{j \in J : \mu_j = 0\}, \quad B = \{j \in J : \mu_j < 0\}.$$

B.1 Probability Limits of the Smoothed Indicator

We first prove a lemma that states the probability limits of the smoothed indicator $\Psi_T(\hat{\theta}_j\hat{\mu}_j)$,
which will be referred to in the proofs of some theorems in this paper.

Lemma 1 (Probability Limits of the Smoothed Indicator)

Assume [D1] and [D4]. Then the following results are valid as $T \to \infty$.

(1) If $j \in A$ and [A1], [A3], [A6] hold, then $\sqrt{T}\Psi_T(\hat{\theta}_j\hat{\mu}_j) \xrightarrow{p} 0$.

(2) If $j \in M$ and [A2], [A4] hold, then $\Psi_T(\hat{\theta}_j\hat{\mu}_j) \xrightarrow{p} \Psi(0)$.

(3) If $j \in B$ and [A1], [A3], [A5] hold, then $\Psi_T(\hat{\theta}_j\hat{\mu}_j) \xrightarrow{p} 1$.

Proof. To show part (1), for $\varepsilon > 0$ and for $\eta > 0$, we want to find some $T(\varepsilon, \eta) > 0$ such that
for $T > T(\varepsilon, \eta)$,

$$P(\sqrt{T}\Psi_T(\hat{\theta}_j\hat{\mu}_j) < \varepsilon) > 1 - \eta.$$

By [D1] and [D4], we have $\hat{\theta}_j\hat{\mu}_j \xrightarrow{p} \theta_j\mu_j$, which is strictly positive for $j \in A$. Then there is a
$T_1(\eta)$ such that for $T > T_1(\eta)$,

$$P(\theta_j\mu_j/2 < \hat{\theta}_j\hat{\mu}_j < 3\theta_j\mu_j/2) > 1 - \eta.$$

Therefore, by [A1] and [A3] we have

$$1 - \eta < P(\Psi_T(3\theta_j\mu_j/2) \le \Psi_T(\hat{\theta}_j\hat{\mu}_j) \le \Psi_T(\theta_j\mu_j/2))$$

where the first inequality follows because $\Psi$ is a non-increasing function. [A6] implies that
$\sqrt{T}\Psi_T(\theta_j\mu_j/2) \to 0$ as $T \to \infty$. Therefore, there is some $T_2(\varepsilon)$ such that for $T > T_2(\varepsilon)$,
$\sqrt{T}\Psi_T(\theta_j\mu_j/2) < \varepsilon$. Combining all these results, part (1) in this lemma follows by choosing
$T(\varepsilon, \eta) = \max(T_1(\eta), T_2(\varepsilon))$.

To show part (2), note that if $j \in M$, by [D1] and [D4], we have $\sqrt{T}\hat{\theta}_j\hat{\mu}_j = O_p(1)$. By [A4],
$K(T)/\sqrt{T} = o(1)$ so that $K(T)\hat{\theta}_j\hat{\mu}_j \xrightarrow{p} 0$. By [A2], $\Psi$ is continuous at the origin. Therefore, part
(2) follows from the application of the continuous mapping theorem.

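To show part (3), for $\varepsilon > 0$ and for $\eta > 0$, we want to find some $T(\varepsilon, \eta) > 0$ such that for
$T > T(\varepsilon, \eta)$,

$$P(1 - \varepsilon < \Psi_T(\hat{\theta}_j\hat{\mu}_j) < 1 + \varepsilon) > 1 - \eta.$$

Following the proof given in part (1), and noting that $\theta_j\mu_j < 0$ for $j \in B$ so that
$3\theta_j\mu_j/2 < \theta_j\mu_j/2$, we have that there is a $T_1(\eta)$ such that for $T > T_1(\eta)$

$$1 - \eta < P(3\theta_j\mu_j/2 < \hat{\theta}_j\hat{\mu}_j < \theta_j\mu_j/2)$$

$$\le P(\Psi_T(\theta_j\mu_j/2) \le \Psi_T(\hat{\theta}_j\hat{\mu}_j) \le \Psi_T(3\theta_j\mu_j/2)).$$

Since $\theta_j\mu_j < 0$, [A5] implies $\Psi_T(\theta_j\mu_j/2) \to 1$ and $\Psi_T(3\theta_j\mu_j/2) \to 1$.
Then there is some $T_3(\varepsilon)$ such that for $T > T_3(\varepsilon)$, $\Psi_T(\theta_j\mu_j/2) > 1 - \varepsilon$ and $\Psi_T(3\theta_j\mu_j/2) < 1 + \varepsilon$.
Therefore, part (3) follows by choosing $T(\varepsilon, \eta) = \max(T_1(\eta), T_3(\varepsilon))$. ■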
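
The three limits in Lemma 1 can be illustrated on the deterministic bounds used in the proof. The sketch below assumes $\Psi_T(x) = \Psi(K(T)x)$ with the logistic smoother and the SIC tuner, and evaluates it at a fixed positive, zero and negative argument as $T$ grows; the function names and the argument values $\pm 0.5$ are hypothetical.

```python
from math import exp, log, sqrt


def psi_log(x):
    """Logistic smoother 1 / (1 + exp(x)), written to avoid overflow."""
    if x >= 0:
        e = exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + exp(x))


def k_sic(T):
    """SIC-type tuner sqrt(T / log T)."""
    return sqrt(T / log(T))


def lemma1_path(x, sample_sizes):
    """Return (sqrt(T) * Psi_T(x), Psi_T(x)) for each T, with Psi_T(x) = Psi(K(T) x)."""
    out = []
    for T in sample_sizes:
        value = psi_log(k_sic(T) * x)
        out.append((sqrt(T) * value, value))
    return out


sizes = (250, 2500, 25000)
positive = lemma1_path(0.5, sizes)   # case j in A: sqrt(T) * Psi_T -> 0
zero = lemma1_path(0.0, sizes)       # case j in M: Psi_T -> Psi(0)
negative = lemma1_path(-0.5, sizes)  # case j in B: Psi_T -> 1
for T, a, m, b in zip(sizes, positive, zero, negative):
    print(T, a[0], m[1], b[1])
```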

B.2 Asymptotic Properties of $\sqrt{T}\,\Psi_T(\hat\theta_j\hat\mu_j)\hat\theta_j\hat\mu_j$

Based on Lemma 1, we derive the asymptotic properties of the components corresponding to
$j \in A$, $j \in M$, $j \in B$ of the sum $\sum_j \sqrt{T}\,\Psi_T(\hat\theta_j\hat\mu_j)\hat\theta_j\hat\mu_j$. The results are stated in the
following lemma.

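Lemma 2 (Asymptotic Properties of $\sqrt{T}\,\Psi_T(\hat\theta_j\hat\mu_j)\hat\theta_j\hat\mu_j$)

Let $v_{jj}$ denote the $j$th diagonal element of $V$. Assume [D1] and [D4]. Then the following
results are valid as $T \to \infty$.

(i) If $j \in A$ and [A1], [A3], [A6] hold, then $\sqrt{T}\,\Psi_T(\hat\theta_j\hat\mu_j)\hat\theta_j\hat\mu_j \xrightarrow{p} 0$.

(ii) If $j \in M$ and [A2], [A4] hold, then $\sqrt{T}\,\Psi_T(\hat\theta_j\hat\mu_j)\hat\theta_j\hat\mu_j \xrightarrow{d} N(0, (\Psi(0)\theta_j)^2 v_{jj})$.

(iii) If $j \in B$ and [A1], [A3], [A5] hold, then $\sqrt{T}\,\Psi_T(\hat\theta_j\hat\mu_j)\hat\theta_j\hat\mu_j \xrightarrow{p} -\infty$.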

Proof. Note that part (i) follows from [D1], [D4] and part (1) of Lemma 1. To show part (ii),
by [D1] and [D4], if $j \in M$, we have that $\sqrt{T}\,\hat\theta_j\hat\mu_j \xrightarrow{d} N(0, \theta_j^2 v_{jj})$. Therefore, part (ii) follows
by applying part (2) of Lemma 1. To show part (iii), note that for $j \in B$,

$$\sqrt{T}\,\Psi_T(\hat\theta_j\hat\mu_j)\hat\theta_j\hat\mu_j = \Psi_T(\hat\theta_j\hat\mu_j)\sqrt{T}\,\hat\theta_j(\hat\mu_j - \mu_j) + \Psi_T(\hat\theta_j\hat\mu_j)\sqrt{T}\,\hat\theta_j\mu_j. \qquad \text{(B.1)}$$

Therefore, part (iii) follows from the fact that by [D1], [D4] and part (3) of Lemma 1, the first
term on the right hand side of (B.1) is $O_p(1)$ and the second term goes to $-\infty$ in probability. ■






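B.3 Asymptotic Properties of $\Lambda_T(\hat\theta_j\hat\mu_j, \hat\theta_j\hat v_{jj})$

The following lemma states the asymptotic properties of the adjustment term $\Lambda_T(\hat\theta_j\hat\mu_j, \hat\theta_j\hat v_{jj})$
defined in (3.2).

Lemma 3 (Asymptotic Properties of $\Lambda_T(\hat\theta_j\hat\mu_j, \hat\theta_j\hat v_{jj})$)

Assume [A1], [A2], [A4], [D3] and [D4]. Then for $j \in J$, $\Lambda_T(\hat\theta_j\hat\mu_j, \hat\theta_j\hat v_{jj}) \xrightarrow{p} 0$.

Proof. By [A1] and [A2] and the properties of the standard normal density function, we find that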



^2_ K(T) 



T 



a. 



where $b_\Psi$ denotes the finite positive bound on the derivative of $\Psi$ given in Assumption [A2].
Note that [A2] also implies of > for each i. By [A4], [D3] and [D4], the right-hand side of the 
inequality above is $o_p(1)$ and thus Lemma 3 follows. ■

B.4 Proof of Theorem 1 

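Proof of part (1):

By Lemma 3 and under $H_0$, the quantity $Q_1$ may be written as

$$Q_1 = \sum_{j \in A}\sqrt{T}\,\Psi_T(\hat\theta_j\hat\mu_j)\hat\theta_j\hat\mu_j + \sum_{j \in M}\sqrt{T}\,\Psi_T(\hat\theta_j\hat\mu_j)\hat\theta_j\hat\mu_j + o_p(1),$$

which, by part (i) of Lemma 2, is asymptotically equivalent in probability to merely

$$\sum_{j \in M}\sqrt{T}\,\Psi_T(\hat\theta_j\hat\mu_j)\hat\theta_j\hat\mu_j,$$

which, by [D1], [D2], [D4] and part (2) of Lemma 1, is asymptotically normal with mean zero
and strictly positive variance equal to $\Psi(0)^2\omega_M$ where $\omega_M = d_M'\Delta V\Delta d_M$ in which $d_M$ denotes
the $p$ dimensional vector whose $j$th element is unity for $j \in M$ but zero for $j \notin M$. Using similar
arguments along with [D3], we also find that

$$Q_2 \xrightarrow{p} \Psi(0)\sqrt{\omega_M}.$$

From these results about $Q_1$ and $Q_2$ and the definition (3.11) of $Q$, we conclude that $Q$ equals
$\Phi(Q_1/Q_2)$ with probability tending to 1 as $T \to \infty$ and thus $Q \xrightarrow{d} U(0, 1)$.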

Proof of part (2):

When $M$ is empty yet $H_0$ holds, only the sums taken for $j \in A$ remain in the definitions of
$Q_1$ and $Q_2$; hence the following analysis is confined to $j \in A$. We distinguish between smoothed
indicators which are such that $\Psi_T(x) = 0$ for all $T$ sufficiently large when $x > 0$ and smoothed
indicators such that $\Psi_T(x)$ remains strictly positive for $x > 0$ for all $T$. In the former case, part
(1) of Lemma 1 implies that $P(\Psi_T(\hat\theta_j\hat\mu_j) = 0) \to 1$ for $j \in A$ and hence $P(Q_2 = 0) \to 1$ and
thus $P(Q = 1) \to 1$.

Now we consider the latter case where $\Psi_T(x) > 0$ for $x > 0$ regardless of $T$. This happens for
everywhere positive $\Psi$ functions. Then the quantity $\tau_j \equiv \hat\theta_j\Psi_T(\hat\theta_j\hat\mu_j)$ is almost surely strictly
positive for all $j \in A$. By eigenvalue theory, for all $T$,

$$Q_2 \le \sqrt{\hat\lambda_{\max}\sum_{j \in A}\tau_j^2} \le \sqrt{p\hat\lambda_{\max}}\,\max_{j \in A}\{\tau_j\} \qquad \text{(B.2)}$$

where $\hat\lambda_{\max}$ is the largest eigenvalue of $\hat V$. Note that (B.2) holds even if $Q_2 = 0$, which under
current scenario could only happen because of singularity of $\hat V$ and $V$. However, when $P(Q_2 = 0) \to 1$,
we have $P(Q = 1) \to 1$ and thus part (2) of the theorem follows.

Note that for $j \in J$, equation (3.2) and Assumptions [A1] and [A2] imply that the term
$\Lambda_T(\hat\theta_j\hat\mu_j, \hat\theta_j\hat v_{jj})$ is non-positive for all $T$. Hence, since all $\mu_j$ are positive by supposition, as $T \to \infty$,
by (3.9) we have that

$$Q_1 \ge \max_{j \in A}\{\tau_j\}\,\min_{j \in A}\{\sqrt{T}\hat\mu_j\}$$

with probability tending to 1. Because the mapping from a positive semi-definite matrix to its
maximum eigenvalue is continuous on the space of such matrices, by [D3] we have $\hat\lambda_{\max} \xrightarrow{p} \lambda_{\max}$
where $\lambda_{\max}$ is the largest eigenvalue of $V$. By [D2], $0 < \lambda_{\max} < \infty$ and thus we have

$$Q_1/Q_2 \ge \min_{j \in A}\{\sqrt{T}\hat\mu_j\}\big/\sqrt{p\hat\lambda_{\max}}$$

with probability tending to 1 as $T \to \infty$. Since $\sqrt{T}\hat\mu_j$ goes to infinity as $T \to \infty$ for $j \in A$, it
follows that $Q = \Phi(Q_1/Q_2) \xrightarrow{p} 1$.

B.5 Proof of Theorem 3 

Since rejection of $H_0$ occurs if $Q < \alpha$ for the test statistic (3.11), it suffices for consistency to show
that under $H_1$, $Q_2$ goes in probability to some positive constant and $Q_1$ goes to minus infinity
as $T \to \infty$. By (3.5) and Lemma 1, the probability limit of $\hat\Psi$ under $H_1$ is the $p$ dimensional
vector $d(\mu)$ whose $j$th element is $[1\{\mu_j < 0\} + \Psi(0)1\{\mu_j = 0\}]$. Therefore, by [D3] and [D4]

$$Q_2 = \sqrt{\hat\Psi'\hat\Delta\hat V\hat\Delta\hat\Psi} \xrightarrow{p} \sqrt{d(\mu)'\Delta V\Delta d(\mu)},$$

which is strictly positive by the regularity condition [D2]. On the other hand, Lemma 2 implies
that $\sqrt{T}\,\Psi_T(\hat\theta_j\hat\mu_j)\hat\theta_j\hat\mu_j$ is bounded in probability for $j \in J \setminus B$ but tends to negative infinity for
$j \in B$. Furthermore, Lemma 3 implies that $\Lambda_T(\hat\theta_j\hat\mu_j, \hat\theta_j\hat v_{jj}) = o_p(1)$ for $j \in J$. Under $H_1$, $B$ is
non-empty and thus $Q_1/Q_2$ goes to $-\infty$ in probability and hence $P(Q < \alpha) \to 1$ as $T \to \infty$.

B.6 Proof of Theorem 4 

Under the assumed form of local sequence (6.1), for all $j$ we have

$$K(T)\hat\theta_j\hat\mu_j = (K(T)/\sqrt{T})\,\hat\theta_j[\sqrt{T}(\hat\mu_j - \mu_j) + c_j] + K(T)\hat\theta_j\gamma_j$$

where $\gamma_j \ge 0$. In the case $\gamma_j = 0$, Assumptions [A4], [D1] and [D4] imply that $K(T)\hat\theta_j\hat\mu_j \xrightarrow{p} 0$ as
$T \to \infty$. By [A2] and the continuous mapping theorem, this then implies that $\Psi(K(T)\hat\theta_j\hat\mu_j) \xrightarrow{p} \Psi(0)$.
On the other hand, if $\gamma_j > 0$, (6.1) implies that there is some $\delta > 0$ such that $\mu_j >
\gamma_j - \delta > 0$ for all $T$ sufficiently large. So under [A1], [A3], [A6], [D1] and [D4], we have that
$\sqrt{T}\,\Psi_T(\hat\theta_j\hat\mu_j)\hat\theta_j\hat\mu_j \xrightarrow{p} 0$ by using arguments closely matching the proof of part (1) of Lemma 1.

Therefore, from these results and by (6.1), [D1], [D4] and Lemma 3, $Q_1$ is asymptotically
equivalent in probability to

$$\Psi(0)\sum_{j=1}^{p} 1\{\gamma_j = 0\}\,\theta_j[\sqrt{T}(\hat\mu_j - \mu_j) + c_j]$$

and thus has an asymptotic normal distribution with mean $\Psi(0)\tau$ and variance $\Psi(0)^2\kappa$. Using
similar arguments, it is straightforward to see that $Q_2 \xrightarrow{p} \Psi(0)\sqrt{\kappa}$. Therefore, $Q_1/Q_2 \xrightarrow{d}
N(\kappa^{-1/2}\tau, 1)$ from which the assertion of Theorem 4 follows.

B.7 Proof of Theorem 5 

We shall establish that for any non-zero vector $c$,

$$\Phi\big(z_\alpha + \sqrt{c'V^{-1}c}\big) \ge P(S(Z + c, V) > q_\alpha) \qquad \text{(B.3)}$$

holds for every testing function $S(\cdot, \cdot)$ such that $P(S(Z, V) > q_\alpha) = \alpha$ under $Z \sim N(0, V)$. The
theorem then follows by noting that the left-hand side of (B.3) when $c = -\delta V d$ coincides with
the power function (6.2) under the local direction specified by the theorem.

To show (B.3), consider an imaginary situation where $X$ is the observable random vector
that is distributed as $Z + \mu_X$ where $Z \sim N(0, V)$. For given $V$, a simple application of the
Neyman-Pearson lemma (Lehmann and Romano (2005, p. 60, Theorem 3.2.1)) implies that a
most powerful test at level $\alpha$ of the simple null hypothesis $\mu_X = 0$ versus the simple alternative
$\mu_X = c$ is to reject the null if and only if $-c'V^{-1}X/\sqrt{c'V^{-1}c} < z_\alpha$. Hence (B.3) holds by
noting that such test has power equal to $\Phi(z_\alpha + \sqrt{c'V^{-1}c})$ which is therefore not smaller than
$P(S(Z + c, V) > q_\alpha)$, the power of another test at level $\alpha$ which rejects the null hypothesis
$\mu_X = 0$ if and only if $S(X, V) > q_\alpha$.
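
The power bound (B.3) can be checked numerically in a small case. The following is a minimal sketch, assuming a hypothetical $2 \times 2$ matrix $V$, a hypothetical shift $c$, and, as the competing test, the level-$\alpha$ test based on the first coordinate only (that is, $S(X, V) = X_1/\sqrt{v_{11}}$ with $q_\alpha = -z_\alpha$). None of these choices comes from the paper; they serve only to compare the two closed-form power expressions.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def quad_form_inv_2x2(V, c):
    """Return c' V^{-1} c for a symmetric positive definite 2x2 matrix V."""
    (a, b), (_, d) = V
    det = a * d - b * b
    return (d * c[0] ** 2 - 2.0 * b * c[0] * c[1] + a * c[1] ** 2) / det

def np_power(V, c, alpha):
    """Power Phi(z_alpha + sqrt(c' V^{-1} c)) of the Neyman-Pearson test."""
    z_alpha = STD.inv_cdf(alpha)
    return STD.cdf(z_alpha + sqrt(quad_form_inv_2x2(V, c)))

def first_coordinate_power(V, c, alpha):
    """Power of the level-alpha test rejecting when X_1/sqrt(v_11) > -z_alpha."""
    z_alpha = STD.inv_cdf(alpha)
    return STD.cdf(z_alpha + c[0] / sqrt(V[0][0]))

V = [[1.0, 0.5], [0.5, 2.0]]   # hypothetical covariance matrix
c = (1.0, -0.5)                # hypothetical non-zero shift
print(np_power(V, c, 0.05), first_coordinate_power(V, c, 0.05))
```

The first printed value should be at least as large as the second, in line with (B.3); by the Cauchy-Schwarz inequality, $c_1/\sqrt{v_{11}} \le \sqrt{c'V^{-1}c}$ for any positive definite $V$.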

B.8 Sufficient Condition for Assumption [U2] 

The following lemma provides a sufficient condition for Assumption [U2] of Section 5. Recall
that $Y \equiv \sqrt{T}(\hat\mu - \mu)$.

Lemma 4 Assumption [U2] holds provided that given any finite scalar $c$,

$$\lim_{T \to \infty}\big|P_{G_T}(\beta_T'Y \le c) - \Phi(c)\big| = 0 \qquad \text{(B.4)}$$

for any sequence $(G_T, \beta_T)$ satisfying $G_T \in \Gamma_0$ and $\beta_T'V_{G_T}\beta_T = 1$.
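Proof. Let

$$f_T(G, \beta) = |P_G(\beta'Y \le c) - \Phi(c)|.$$

Let $S$ denote the set $\{(G, \beta) : G \in \Gamma_0, \beta \in \Sigma(G)\}$ where the set $\Sigma(G) = \{\beta \in R^p : \beta'V_G\beta = 1\}$.
Note that

$$\sup_{G \in \Gamma_0}\ \sup_{\beta \in \Sigma(G)} f_T(G, \beta) = \sup_{(G, \beta) \in S} f_T(G, \beta). \qquad \text{(B.5)}$$

Since for any $\varepsilon > 0$, there is a pair $(G_T(\varepsilon), \beta_T(\varepsilon)) \in S$ such that

$$\sup_{(G, \beta) \in S} f_T(G, \beta) \le f_T(G_T(\varepsilon), \beta_T(\varepsilon)) + \varepsilon,$$

Assumption (B.4) used with equality (B.5) implies

$$\limsup_{T \to \infty}\ \sup_{G \in \Gamma_0}\ \sup_{\beta \in \Sigma(G)} f_T(G, \beta) \le \varepsilon.$$

Hence Assumption [U2] follows by noting that $\varepsilon$ is arbitrarily chosen and $f_T \ge 0$. ■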

B.9 Proof of Theorem 2 

We aim to establish the inequality

$$\limsup_{T \to \infty}\ \sup_{G \in \Gamma_0} P_G(Q < \alpha) \le \alpha. \qquad \text{(B.6)}$$

Then Theorem 2 follows by combining together the results implied by (B.6) and Part (1) of
Theorem 1.
Let $z_\alpha$ be the $\alpha$ quantile of the standard normal distribution. The test rejects the null
hypothesis if and only if $Q_2 > 0$ and $Q_1 - z_\alpha Q_2 < 0$. Therefore,

$$P_G(\text{reject } H_0) \le P_G(Q_1 - z_\alpha Q_2 < 0). \qquad \text{(B.7)}$$
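
The equivalence between the two forms of the rejection rule can be sketched in code. This is a minimal illustration only: the function names are hypothetical, and the convention $Q = 1$ when $Q_2 = 0$ is inferred from the proof of Theorem 1 (where $P(Q_2 = 0) \to 1$ yields $P(Q = 1) \to 1$) rather than copied from the definition (3.11).

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def q_statistic(q1: float, q2: float) -> float:
    """Q = Phi(Q1/Q2) when Q2 > 0; Q = 1 when Q2 = 0 (inferred convention)."""
    if q2 < 0:
        raise ValueError("Q2 is a square root and cannot be negative")
    return STD.cdf(q1 / q2) if q2 > 0 else 1.0

def reject(q1: float, q2: float, alpha: float) -> bool:
    """Reject the null hypothesis when Q < alpha (fixed critical value)."""
    return q_statistic(q1, q2) < alpha

def reject_equivalent(q1: float, q2: float, alpha: float) -> bool:
    """Same decision written as: Q2 > 0 and Q1 - z_alpha * Q2 < 0."""
    z_alpha = STD.inv_cdf(alpha)
    return q2 > 0 and (q1 - z_alpha * q2) < 0

for q1, q2 in [(-2.0, 1.0), (-1.0, 1.0), (0.5, 2.0), (-3.0, 0.0)]:
    print(q1, q2, reject(q1, q2, 0.05), reject_equivalent(q1, q2, 0.05))
```

The second form is the one used in the remainder of the proof, since it avoids dividing by $Q_2$.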

The strategy of the proof is to demonstrate that $P_G(Q_1 - z_\alpha Q_2 < 0)$ is asymptotically bounded
by the nominal size $\alpha$ uniformly for all $G$ satisfying the null hypothesis. That then validates
(B.6) via (B.7). Note that $-z_\alpha > 0$ for $0 < \alpha < 1/2$ as used in this theorem. By (3.9), (3.10)
and non-positivity of the $\Lambda_T$ term, we have

p p 
i=i j=i 

where $\hat v_{ij}$ and $v_{ij}$ are the elements of $\hat V$ and $V_G$, respectively. For notational simplicity, the
dependence of $\mu$ and $v_{ij}$ on $G$ is kept implicit.

Now we give details of the proof. For ease of presentation, they are organized in the following 
headed subsections. 

1. Lower Bound for the Difference $(Q_1 - z_\alpha Q_2)$
Let $\delta_T = \sqrt{K(T)/\sqrt{T}}$. For any $\eta > 0$, define the set

$$R_T(\mu) = \{j : 0 \le K(T)\mu_j < 2\eta\delta_T\}.$$

We show that, with probability tending to 1 uniformly over $G \in \Gamma_0$ as $T \to \infty$,

$$Q_1 - z_\alpha Q_2 \ge Q_{1,R_T} - z_\alpha Q_{2,R_T} \qquad \text{(B.8)}$$

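where

$$Q_{1,R_T} \equiv \sum_{j \in R_T(\mu)}\sqrt{T}\,\Psi(K(T)\hat\theta_j\hat\mu_j)\hat\theta_j\hat\mu_j, \qquad Q_{2,R_T} \equiv \sqrt{\sum_{i \in R_T(\mu)}\sum_{j \in R_T(\mu)}\Psi(K(T)\hat\theta_i\hat\mu_i)\Psi(K(T)\hat\theta_j\hat\mu_j)\hat\theta_i\hat\theta_j\hat v_{ij}}.$$

We follow the convention that summation over an empty set yields value zero. Note that (B.8)
automatically holds when $R_T(\mu) = \{1, 2, ..., p\}$. For $R_T(\mu)$ being a proper subset of $\{1, 2, ..., p\}$,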
we rely on the fact (proved in the next subsection) that, with probability tending to 1 uniformly
over $G \in \Gamma_0$ as $T \to \infty$,

$$K(T)\hat\mu_j > \eta\delta_T \ \text{ for } j \notin R_T(\mu) \qquad \text{(B.9)}$$

and, for $R_T(\mu)$ nonempty,

$$Q_{2,R_T}^2 > \omega'/2 > 0 \qquad \text{(B.10)}$$

where $\omega'$ is the constant defined in Assumption [U4]-(ii). Let $m$ be any index such that $m \notin R_T(\mu)$
and $\hat\theta_m\hat\mu_m \le \hat\theta_j\hat\mu_j$ for all $j \notin R_T(\mu)$. Since $\Psi$ is non-negative, (B.9) implies



(B.ll) 



Furthermore, by [A1] the function $\Psi$ is non-increasing and $\Psi \le 1$. Thus, (B.9) and (B.10)
together imply



\Q2,R^ - Q2I < \QIh^ - Ql\ /Q2,B.r < P^^iK{T)0ml2J 



A 



V 



y/2/u 



(B.12) 



Given that $-z_\alpha > 0$, when $R_T(\mu)$ is empty, (B.11) alone implies (B.8). With $R_T(\mu)$ non-empty,
(B.11) and (B.12) together imply (B.8) provided



A 



(B.13) 



We show that under the null hypothesis, (B.9), (B.10) and (B.13) will indeed hold for $\eta$ small
enough and $T$ large enough (yielding $\delta_T$ small enough by Assumption [A4]) under the key event
$E_T^\eta$ described next.

2. The Key Event $E_T^\eta$ and Lower Bound for the Difference $(Q_{1,R_T} - z_\alpha Q_{2,R_T})$
Let $Y_j$ be the $j$th element of $Y = \sqrt{T}(\hat\mu - \mu)$. For $\eta > 0$, define the event



E} = {6t\\Y\\<7j,\\V-Vg\\ <v, 



A- A 



which holds with probability tending to 1 uniformly over $G \in \Gamma_0$ as $T \to \infty$ by Assumptions
[A4], [U1] and [U3]-(ii). Since $K(T)\hat\mu_j = K(T)\mu_j + \delta_T^2 Y_j$, under the null hypothesis the event
$E_T^\eta$ implies the inequality (B.9). To show that the event $E_T^\eta$ also implies (B.10) and (B.13),
and then derive the key result (B.18) of this subsection, we first need to draw out the following
inequalities (B.14) - (B.17).

Note that when $0 \le K(T)\mu_j < 2\eta\delta_T$, we have that by Assumption [U3]-(i) and under the
event $E_T^\eta$,



(B.14) 
(B.15) 
By Assumption [A2], $\Psi(x)$ is differentiable on $|x| < 3\eta\delta_T(\lambda + \eta\delta_T)$ for $\eta$ small enough and $T$
large enough. Therefore, given $\Psi \le 1$, the event $E_T^\eta$ and inequalities (B.14) and (B.15) imply

that

where $b_\Psi$ denotes the bound on the derivative of $\Psi(x)$ defined in Assumption [A2]. Hence, when
$\eta < 1$ and $\delta_T < 1$, we may certainly write

where $C_1$ is a fixed positive quantity given values of $p$, $\lambda$ and $b_\Psi$. By Assumptions [U3]-(i) and
[U4]-(i) and using similar arguments with $\eta < 1$ and $\delta_T < 1$, we can obtain a bound for $Q_{2,R_T}^2$
under the event $E_T^\eta$ as the following

$$Q_{2,R_T}^2 \ge \Psi(0)^2\sum_{i \in R_T(\mu)}\sum_{j \in R_T(\mu)}\theta_i\theta_j v_{ij} - C_2\eta \qquad \text{(B.17)}$$

where $C_2$ is fixed and positive given values of $p$, $\lambda$, $\omega$, $b_\Psi$ and $\Psi(0)$.

We can choose $\eta$ to satisfy $\eta < \min\{1, \omega'/(2C_2)\}$ and choose $T$ such that $2\eta\delta_T/K(T) < \sigma$,
where $\sigma$ is the constant defined in Assumption [U4] by which the right-hand side of (B.17) is
larger than $\omega'/2$ and hence inequality (B.10) is satisfied. Using Assumptions [U3]-(i) and [U4]-(i),

^ ^ 2 ^ 

under the event £'^, we see 6* „j > A' — (Jt?? whilst A V < {\+Stti)'^{i^ +'>])■ Since 5^, — ^00 

by Assumption [A4], given $\eta > 0$, (B.13) will indeed hold for large enough $T$. Finally, let $r_T$
denote the $p$ dimensional vector whose $j$th element is $\theta_j$ if $j \in R_T(\mu)$ and zero, otherwise. Then
given that $-z_\alpha > 0$ and with $\eta$ small enough and $T$ large enough, (B.16) and (B.17) together
imply

$$Q_{1,R_T} - z_\alpha Q_{2,R_T} \ge \Psi(0)r_T'Y - C_1\eta - z_\alpha\sqrt{\Psi(0)^2 r_T'V_G r_T - C_2\eta}. \qquad \text{(B.18)}$$
3. The Probability Bounds 

We have shown above how occurrence of the event $E_T^\eta$ implies the inequality (B.8) given $\eta$ small
enough and $T$ large enough. Hence

$$P_G(Q_1 - z_\alpha Q_2 < 0) \le 1 - P_G(E_T^\eta) + P_G(Q_1 - z_\alpha Q_2 < 0,\ E_T^\eta) \le 1 - P_G(E_T^\eta) + P_G(Q_{1,R_T} - z_\alpha Q_{2,R_T} < 0) \qquad \text{(B.19)}$$

where the last term of (B.19) is zero when $R_T(\mu)$ is empty. For non-empty $R_T(\mu)$, using (B.18)
yields

$$P_G(Q_{1,R_T} - z_\alpha Q_{2,R_T} < 0) \le P_G\Big(r_T'Y - z_\alpha\sqrt{r_T'V_G r_T - C_2\eta/\Psi(0)^2} < C_1\eta/\Psi(0)\Big). \qquad \text{(B.20)}$$
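The probability in the right-hand side of (B.20) may be written as

$$P_G(\beta_T'Y < z_\alpha C_{2,R_T} + \eta C_{1,R_T}) \qquad \text{(B.21)}$$

where

$$\beta_T = \frac{r_T}{\sqrt{r_T'V_G r_T}}, \qquad C_{1,R_T} = \frac{C_1}{\Psi(0)\sqrt{r_T'V_G r_T}}, \qquad C_{2,R_T} = \sqrt{1 - \frac{C_2\eta}{\Psi(0)^2\, r_T'V_G r_T}}.$$

Note that by [U4]-(ii), we have that with $T$ large enough, $0 < C_{1,R_T} \le C_1/(\Psi(0)\sqrt{\omega'})$ and
$\sqrt{1 - C_2\eta/\omega'} \le C_{2,R_T} \le 1$. Hence, given $z_\alpha < 0$ and $\eta$ small enough, the probability (B.21)
cannot exceed

$$P_G\Big(\beta_T'Y < z_\alpha\sqrt{1 - C_2\eta/\omega'} + C_1\eta/(\Psi(0)\sqrt{\omega'})\Big). \qquad \text{(B.22)}$$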

Given the fact that $\beta_T$ is non-stochastic with $\beta_T'V_G\beta_T = 1$, Assumption [U2] implies that
given $\eta$, for any $\xi > 0$, there is a threshold $T^*(\eta, \xi)$ such that for $T > T^*(\eta, \xi)$, the probability
(B.22) will be smaller than

$$\Phi\Big(z_\alpha\sqrt{1 - C_2\eta/\omega'} + C_1\eta/(\Psi(0)\sqrt{\omega'})\Big) + \xi$$

uniformly over all $G$ obeying the null hypothesis. On the other hand, by Assumptions [A4], [U1]
and [U3]-(ii) applied to the event $E_T^\eta$, for any $\varepsilon > 0$, there is a threshold $T^{**}(\eta, \varepsilon)$ such that
for $T > T^{**}(\eta, \varepsilon)$, $P_G(E_T^\eta) > 1 - \varepsilon$ uniformly over all $G$ obeying the null hypothesis. Putting
together these facts and (B.19), (B.20), (B.22), we have that for $T > \max\{T^*(\eta, \xi), T^{**}(\eta, \varepsilon)\}$,

$$\sup_{G \in \Gamma_0} P_G(Q_1 - z_\alpha Q_2 < 0) \le \Phi\Big(z_\alpha\sqrt{1 - C_2\eta/\omega'} + C_1\eta/(\Psi(0)\sqrt{\omega'})\Big) + \xi + \varepsilon$$

from which by letting $T \to \infty$ in accordance with $T > \max\{T^*(\eta, \xi), T^{**}(\eta, \varepsilon)\}$ as the scalars
$\eta$, $\xi$ and $\varepsilon$ approach zero, it follows that $\limsup_{T \to \infty}\sup_{G \in \Gamma_0} P_G(Q_1 - z_\alpha Q_2 < 0) \le \alpha$.

C Covariance Singularity Examples 

In this appendix section, we present three examples of estimator covariance singularity for which
the high level assumptions [D2] and [U4]-(ii) are verified. Recall that $G$ is the joint distribution
from which the underlying individual data vector is randomly sampled. $\Gamma$ is the set of all possible
$G$ compatible with presumed specification of the data generating process and $\Gamma_0$ is the subset of
$\Gamma$ that satisfies the null hypothesis. All parameter values such as $\mu$ and $V$ depend on the point
$G$ of evaluation but we keep that implicit to avoid notational clutter.

In the first two examples, the econometric model is initially characterized by an $r$ dimensional
vector of parameters $\beta = (\beta_1, \beta_2, ..., \beta_r)'$. The restrictions being tested are synthesized into the
one-sided form $\mu \ge 0$ with $\mu = (\mu_1, \mu_2, ..., \mu_p)' = C\beta + h$ where $C$ is a known $p \times r$ matrix and $h$
is a known $p$ dimensional vector of constants. We assume an asymptotically normal estimator $\hat\beta$
is available with non-singular asymptotic variance matrix $\Omega$. Since $V = C\Omega C'$, the $V$ value induced
by any $G \in \Gamma$ is necessarily singular when $r < p$. In the third example, we consider a different
scenario where singularity arises only for some specific $V$ values.

Example 1: Triangle Restriction 

For a Cobb-Douglas production function with capital and labor elasticity coefficients $\beta_1$ and $\beta_2$,
the restrictions being tested $\beta_1 \ge 0$, $\beta_2 \ge 0$ and $\beta_1 + \beta_2 \le 1$ (non-increasing returns to scale)
form a triangle for the graph of $(\beta_1, \beta_2)$. Here $r = 2$, $p = 3$ and

$$\mu = (\mu_1, \mu_2, \mu_3)' = (\beta_1, \beta_2, 1 - \beta_1 - \beta_2)'. \qquad \text{(C.1)}$$

Verification of [D2] and [U4]-(ii) : Note that $V = C\Omega C'$ where $\Omega$ is the variance matrix
of the asymptotic distribution of $\sqrt{T}(\hat\beta - \beta)$ and

$$C' = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \end{pmatrix}, \qquad C'\Delta d(\mu) = \begin{pmatrix} \theta_1 & 0 & -\theta_3 \\ 0 & \theta_2 & -\theta_3 \end{pmatrix} d(\mu).$$

We assume the primitive condition that the smallest eigenvalue of $\Omega$ is bounded away from zero
over all $G \in \Gamma$. Assumption [D2] is true since $C'\Delta d(\mu)$ being zero for non-zero $d(\mu)$ would
require all elements of $d(\mu)$ to be non-zero, in turn requiring all elements of $\mu$ given by (C.1) to
be negative or zero, which is impossible. For Assumption [U4]-(ii), we note that for sufficiently
small $\sigma$, the only non-zero values for $d_\sigma(\mu)$ possible under the null hypothesis are $\Psi(0)$ multiples
of $(1,0,0)'$, $(0,1,0)'$, $(0,0,1)'$, $(1,1,0)'$, $(1,0,1)'$, $(0,1,1)'$, because it is not possible for more
than two of the elements of $\mu$ to simultaneously lie between $0$ and $\sigma < 1/3$ as $\mu_1 + \mu_2 + \mu_3 = 1$.
Therefore, given Assumption [U3]-(i) and the primitive condition on $\Omega$, Assumption [U4]-(ii) is
satisfied here.
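
The singularity of $V = C\Omega C'$ in this example is easy to confirm numerically. The sketch below is a minimal illustration: the matrix `omega` holds hypothetical values chosen only for the demonstration, while `C` and `h` follow from (C.1).

```python
# C and h encode mu = C beta + h as in (C.1); omega is a HYPOTHETICAL
# non-singular 2x2 asymptotic variance matrix, chosen for illustration only.
C = [[1.0, 0.0],
     [0.0, 1.0],
     [-1.0, -1.0]]
h = [0.0, 0.0, 1.0]

def mu_from_beta(beta):
    """Map (beta_1, beta_2) to (mu_1, mu_2, mu_3) = (beta_1, beta_2, 1 - beta_1 - beta_2)."""
    return [sum(C[i][k] * beta[k] for k in range(2)) + h[i] for i in range(3)]

def induced_V(omega):
    """Return the 3x3 matrix V = C Omega C'."""
    c_omega = [[sum(C[i][k] * omega[k][l] for k in range(2)) for l in range(2)]
               for i in range(3)]
    return [[sum(c_omega[i][l] * C[j][l] for l in range(2)) for j in range(3)]
            for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

omega = [[0.04, 0.01], [0.01, 0.09]]
V = induced_V(omega)
print(det3(V))                   # numerically zero: V has rank 2 < p = 3
print([sum(row) for row in V])   # each row sums to zero
```

Every row of $V$ sums to zero because $\mu_1 + \mu_2 + \mu_3 = 1$ holds identically, so the vector $(1,1,1)'$ lies in the null space of $V$ whatever the value of $\Omega$.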

Example 2: Interval Restrictions with Fixed Known End-Points 

Suppose the $r$ dimensional parameter vector $\beta$ is hypothesized to satisfy interval restrictions
$l \le \beta \le u$, where $l$ and $u$ are numerically specified. In this case, $p = 2r$ and $\mu = ((\beta - l)', (u - \beta)')'$.
An estimator $\hat\beta$ is available such that $\sqrt{T}(\hat\beta - \beta)$ is asymptotically normal with variance $\Omega$ whose
smallest eigenvalue is assumed primitively to be bounded away from zero over all $G \in \Gamma$. Note
that $V = C\Omega C'$ where $C' = [I_r, -I_r]$. Thus, $C'\Delta d(\mu)$ is the $r$ dimensional vector whose $j$th
element is

$$[1\{\beta_j < l_j\} + \Psi(0)1\{\beta_j = l_j\}]\theta_j - [1\{\beta_j > u_j\} + \Psi(0)1\{\beta_j = u_j\}]\theta_{j+r} \qquad \text{(C.2)}$$

for $j \le r$. We consider the following two cases of interval hypotheses.
Case I : All hypothesized intervals are non-degenerate

For Case I, the null hypothesis concerns only non-degenerate intervals in the sense that $u_j > l_j$
for all $j \le r$.

Verification of [D2] and [U4]-(ii) for null hypothesis given by Case I : Note that
under $H_1$, $\beta_j < l_j$ or $\beta_j > u_j$ for some $j \le r$ and thus (C.2) is either $\theta_j$ or $-\theta_{j+r}$ for some
$j \le r$. Hence $C'\Delta d(\mu)$ is non-zero and Assumption [D2] holds under the alternative hypothesis.
We need to further show that $C'\Delta d(\mu)$ is not equal to zero for non-zero $d(\mu)$ under the null
hypothesis. But under $H_0$, (C.2) simplifies to

$$\Psi(0)\,[1\{\beta_j = l_j\}\theta_j - 1\{\beta_j = u_j\}\theta_{j+r}] \qquad \text{(C.3)}$$

for all $j \le r$. Given that $u_j > l_j$ for all $j$, there is some $j$ such that expression (C.3) equals either
$\Psi(0)\theta_j$ or $-\Psi(0)\theta_{j+r}$ whenever $d(\mu)$ is non-zero under the null hypothesis. Hence, Assumption
[D2] is verified.

We now verify the high level assumption [U4]-(ii). Under the null hypothesis, the $j$th element
of $C'\Delta d_\sigma(\mu)$ is

$$\Psi(0)\,[1\{l_j + \sigma \ge \beta_j \ge l_j\}\theta_j - 1\{u_j \ge \beta_j \ge u_j - \sigma\}\theta_{j+r}]. \qquad \text{(C.4)}$$

For $\sigma < \min_{j \in \{1,2,...,r\}}(u_j - l_j)/2$, if $d_\sigma(\mu)$ is non-zero, then there is some $j$ such that expression
(C.4) equals either $\Psi(0)\theta_j$ or $-\Psi(0)\theta_{j+r}$ and thus $C'\Delta d_\sigma(\mu)$ is a non-zero vector of length which is
bounded away from zero by Assumption [U3]-(i). Given the primitive eigenvalue assumption
on $\Omega$, this completes verification of Assumption [U4]-(ii).

Case II : At least one hypothesized interval is degenerate 

For Case II, at least one interval is specified to be degenerate (i.e. $l_j = u_j$ for some $j \le r$) in the
null hypothesis. Let $S_e$ denote the subset of $\{1, 2, ..., r\}$ such that $l_j = u_j$ holds for all $j \in S_e$
but $l_j < u_j$ for all $j \notin S_e$.

Verification of [D2] and [U4]-(ii) for null hypothesis given by Case II : Under $H_1$,
Assumption [D2] holds by the same arguments as given in Case I. Under $H_0$, (C.3) becomes
$\Psi(0)(\theta_j - \theta_{j+r})$ for all $j \in S_e$. In this case, Assumption [D2] still holds but the restriction that
$\theta_j \ne \theta_{j+r}$ for at least one $j \in S_e$ has to be imposed. This extra restriction guarantees that
$C'\Delta d(\mu)$ is not equal to zero for all non-zero $d(\mu)$ and thus [D2] is fulfilled.

We now verify the high level assumption [U4]-(ii). Note that [U4]-(ii) only concerns the
null hypothesis under which (C.4) becomes $\Psi(0)(\theta_j - \theta_{j+r})$ for all $j \in S_e$. Therefore, provided
that there is one $j \in S_e$ such that $|\theta_j - \theta_{j+r}|$ is bounded away from zero over all $G \in \Gamma_0$,
then $C'\Delta d_\sigma(\mu)$ is also a non-zero vector of length which is bounded away from zero. Given the
primitive condition on $\Omega$, Assumption [U4]-(ii) is thus satisfied for any $\sigma > 0$.

We now comment on testing interval hypothesis of the Case II type within the framework of
this paper. For validity of the test, it suffices to choose any single equality hypothesis indexed
by $h \in S_e$ and specify $\theta_h \ne \theta_{h+r}$ at the outset. This single asymmetry requirement is the only
operational difference compared with Case I. Moreover, since $v_{h,h} = v_{h+r,h+r}$ where $v_{h,h}$ denotes
the $h$-th diagonal element of $V$, weighting inversely proportional to standard error is not ruled
out. The user can indeed set

$$\hat\theta_{h+r} = (1 + \varepsilon)\hat\theta_h \ \text{ with } \ \hat\theta_h = 1/\sqrt{\hat v_{h,h}}, \quad \varepsilon > -1 \ \text{ and } \ \varepsilon \ne 0. \qquad \text{(C.5)}$$

Here $\varepsilon$ is a non-stochastic quantity chosen by the user to control the degree of deviation from
perfect standardization of the estimate $\hat\mu_{h+r}$. The weighting scheme (C.5) ensures that the test
has exact asymptotic size in the uniform sense and is consistent against all fixed alternatives.
On the other hand, Theorem 5 suggests that the user can specify $\varepsilon < 0$ (or reverse) to attach
more (or less) weight to detection of violation of $H_0$ in the direction of $\beta_h < l_h$.
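
A minimal sketch of the asymmetric weighting scheme (C.5) follows. The function names and the numerical inputs are hypothetical; the only content taken from the text is the rule $\hat\theta_{h+r} = (1+\varepsilon)\hat\theta_h$ with $\hat\theta_h = 1/\sqrt{\hat v_{h,h}}$, and the fact that under a binding equality the relevant element of $C'\Delta d(\mu)$ is $\Psi(0)(\theta_h - \theta_{h+r})$.

```python
from math import sqrt

def interval_weights(v_hh: float, eps: float):
    """Return (theta_h, theta_{h+r}) under (C.5); v_hh is an estimated variance."""
    if v_hh <= 0:
        raise ValueError("v_hh must be strictly positive")
    if eps <= -1 or eps == 0:
        raise ValueError("(C.5) requires eps > -1 and eps != 0")
    theta_h = 1.0 / sqrt(v_hh)
    return theta_h, (1.0 + eps) * theta_h

def equality_element(psi0: float, v_hh: float, eps: float) -> float:
    """Psi(0) * (theta_h - theta_{h+r}) for a degenerate interval l_h = u_h."""
    theta_h, theta_hr = interval_weights(v_hh, eps)
    return psi0 * (theta_h - theta_hr)

# Hypothetical inputs: v_hh = 4, eps = 0.1, Psi(0) = 0.5
print(interval_weights(4.0, 0.1), equality_element(0.5, 4.0, 0.1))
```

Because the element equals $-\Psi(0)\varepsilon\theta_h$, it is non-zero exactly when $\varepsilon \ne 0$, which is why perfectly symmetric standardization is excluded in Case II.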

Note that asymmetric weighting (C.5) adopted here can be viewed as "perturbing" both
$Q_1$ and $Q_2$ from the values they would take under symmetry. One might think to perturb
only $Q_2$ to ensure that singularity does not cause division by (near) zero. For example, one
could perturb $\hat V$ in the expression (3.10) defining $Q_2$ in a manner akin to Andrews and Barwick
(2012) who adjust the QLR test statistic by perturbing $\hat V$ with a diagonal matrix when the
determinant of the correlation matrix induced by $\hat V$ is smaller than some pre-specified threshold.
This alternative approach can allow for symmetric weighting. However, unperturbed $Q_1$ will
asymptotically converge to zero and hence rejection probability will tend to zero under the null
and local alternative scenarios where all non-degenerate interval inequalities are non-binding.
By contrast, the procedure (C.5) perturbing both $Q_1$ and $Q_2$ in a balanced way ensures that
the ratio $Q_1/Q_2$ stays asymptotically standard normal in the null even when the only binding
constraints are the equality hypotheses. It thus enables non-zero test power to be retained in
the aforementioned scenarios of local alternatives.

Example 3: Interval Restrictions with Unknown End-Points 

In Example 2, testing the inequalities $l \le \beta \le u$ was performed on fixed known interval end-points.
Suppose now that $l$ and $u$ are not known but are parameters which satisfy $l \le u$ and can
take a continuum of values including those which make $(u - l)$ arbitrarily close to zero as well as
precisely zero. There is no point estimator for $\beta$ but consistent estimators $\hat l$ and $\hat u$ are available
having joint asymptotic normal distribution with variance matrix $\Omega$. This, for the univariate
case, is the scenario considered by Imbens and Manski (2004) and Stoye (2009). For clarity, we
stay with the setup where $\beta$ is a scalar. We consider testing $H_0 : l \le \beta_0 \le u$ for a numerically
specified candidate value $\beta_0$ for $\beta$. We then take $\mu = (\beta_0 - l, u - \beta_0)'$ and $\hat\mu = (\beta_0 - \hat l, \hat u - \beta_0)'$.
The asymptotic distribution of $\sqrt{T}(\hat\mu - \mu)$ is normal with variance

$$V = \begin{pmatrix} \Omega_{11} & -\Omega_{12} \\ -\Omega_{12} & \Omega_{22} \end{pmatrix}.$$

For any given $l$ and $u$, there is no reason why $V$ should be singular. However, Stoye (2009, p.
1304, Lemma 3) demonstrates that, if one insists on $P(\hat u \ge \hat l) = 1$ holding over the underlying
data generating distribution space where the difference $(u - l)$ is bounded away from infinity and
the elements $\Omega_{11}$ and $\Omega_{22}$ are bounded away from zero and infinity, then $V$ necessarily depends on
$(u - l)$ in such a way that $\Omega_{12} - \Omega_{11} \to 0$ and $\Omega_{22} - \Omega_{11} \to 0$ as $u - l \to 0$. Thus, singularity
of $V$ where $\Omega_{11} = \Omega_{22} = \Omega_{12}$ must be allowed for.

Verification of [D2] and [U4]-(ii) : For Assumption [D2], note that under the maintained
assumption that $l \le u$, the vector $d(\mu)$ can be non-zero only if it takes one of the following
forms: $(1,0)'$, $(0,1)'$, $(\Psi(0),0)'$, $(0,\Psi(0))'$, $(\Psi(0),\Psi(0))'$. The first four of these cannot make
$V\Delta d(\mu) = 0$. The last form can only occur when $l = \beta_0 = u$ in which case we have

$$V\Delta d(\mu) = \Psi(0)\,[\theta_1\Omega_{11} - \theta_2\Omega_{12},\ -\theta_1\Omega_{12} + \theta_2\Omega_{22}]'. \qquad \text{(C.6)}$$

Note that (C.6) is zero only if $V$ is singular and $\theta_1/\theta_2 = \Omega_{12}/\Omega_{11} = \Omega_{22}/\Omega_{12}$. Singularity occurs
in Stoye's scenario where the model allows for $\Omega_{11} = \Omega_{22} = \Omega_{12}$. Since the weights $\theta_1$ and $\theta_2$ are
chosen by the user, we can use $\hat\theta_1 = 1/\sqrt{\hat\Omega_{11}}$ and $\hat\theta_2 = (1+\varepsilon)/\sqrt{\hat\Omega_{22}}$ where $\varepsilon$ is a pre-specified
non-stochastic and non-zero quantity satisfying $\varepsilon > -1$. Then Assumption [D2] holds regardless of
singularity of $V$. For Assumption [U4]-(ii), we only need to consider the null hypothesis. In this
case, the possible forms that non-zero $d_\sigma(\mu)$ can take are $(\Psi(0), 0)'$, $(0, \Psi(0))'$ and $(\Psi(0), \Psi(0))'$. It
is easily seen that $d_\sigma(\mu)'\Delta V\Delta d_\sigma(\mu)$ equals $\Psi(0)^2$ for the first, $\Psi(0)^2(1+\varepsilon)^2$ for the second, and
$\Psi(0)^2\,[\varepsilon^2 + 2(1+\varepsilon)(1 - \Omega_{12}/\sqrt{\Omega_{11}\Omega_{22}})]$ for the third form. Hence Assumption [U4]-(ii) holds.
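
The three values of the quadratic form $d_\sigma(\mu)'\Delta V\Delta d_\sigma(\mu)$ can be verified by direct computation. The sketch below is illustrative only: the values of $\Omega$, $\varepsilon$ and $\Psi(0)$ are hypothetical, population quantities are used in place of estimates, and $V$ is taken to have diagonal elements $\Omega_{11}, \Omega_{22}$ and off-diagonal element $-\Omega_{12}$, consistent with (C.6).

```python
from math import sqrt

def quad_form(psi0, o11, o22, o12, eps, pattern):
    """d' Delta V Delta d for d = psi0 * pattern, with the weights of Example 3."""
    theta = (1.0 / sqrt(o11), (1.0 + eps) / sqrt(o22))
    V = ((o11, -o12), (-o12, o22))   # variance of sqrt(T)(mu_hat - mu)
    w = [psi0 * pattern[i] * theta[i] for i in range(2)]
    return sum(w[i] * V[i][j] * w[j] for i in range(2) for j in range(2))

def closed_form_both_binding(psi0, o11, o22, o12, eps):
    """Psi(0)^2 [eps^2 + 2(1+eps)(1 - Omega_12/sqrt(Omega_11*Omega_22))]."""
    rho = o12 / sqrt(o11 * o22)
    return psi0 ** 2 * (eps ** 2 + 2.0 * (1.0 + eps) * (1.0 - rho))

# Hypothetical singular case Omega_11 = Omega_22 = Omega_12 = 1 with eps = 0.2
psi0, eps = 0.5, 0.2
print(quad_form(psi0, 1.0, 1.0, 1.0, eps, (1, 1)),
      closed_form_both_binding(psi0, 1.0, 1.0, 1.0, eps))
```

In the singular case the third form reduces to $\Psi(0)^2\varepsilon^2$, which is strictly positive only because $\varepsilon \ne 0$.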

In this example, the weights $\theta_1$ and $\theta_2$ are chosen asymmetrically, and setting $\varepsilon$ to be greater
(smaller) than zero amounts to attaching more (or less) weight to detection of violation of $H_0$
in the direction $u < \beta_0$. The $\varepsilon$-perturbation arguments adopted here are indeed based on those
given in Case II of Example 2. The value of the perturbation parameter $\varepsilon$ is a user's input to the
test procedure. The choice does not affect validity of the results concerning asymptotic test size
and consistency. Asymmetry does affect local power but, by the same device, offers the user an
opportunity to input a subjective assessment of the relative importance of different directions of
violation of the null hypothesis.